Registration & Discovery of Web Services

REGISTRATION + DISCOVERY / SCALING WEB SERVICES

SIX MSA PATTERNS.
The most common microservice patterns are Message-Oriented, Event-Driven, Isolated State, Replicating State, Fine-Grained (SOA), and Layered APIs.
Web Services and Business Processes were once complicated by the issue of State. By their nature, Business Processes are Stateful, in that changes occur after each step is performed. The moment of each event is measured as the clock ticks. In the early days, Web Services were always Stateless. This was slowly resolved with non-proprietary solutions such as Business Process Model and Notation (BPMN) and Business Process Execution Language (BPEL). Yet, some Web Services execute or expose computing functions and others execute business processes. In some cases a clock matters, and in other cases it does not.

Kristopher Sandoval, who writes for Nordic APIs, sees a reason for people to be confused, noting “stateless services have managed to mirror a lot of the behavior of stateful services without technically crossing the line.” He explains, “When the state is stored by the server, it generates a session. This is stateful computing. When the state is stored by the client, it generates some kind of data that is to be used for various systems — while technically ‘stateful’ in that it references a state, the state is stored by the client so we refer to it as stateless.”

STATEFUL PATTERNS AND EVI.
Traditional system design favors consistent data queries and mutating state, which is not how distributed architectures are designed. To avoid unexpected results or data corruption, state needs to be explicitly declared or each component needs to be autonomous. Event-driven patterns provide standards to avoid the side effects of explicitly declaring a state. Message-oriented systems use a queue, while event-based systems also set and enforce standards for the design and behavior of messages over the queue, including that each message carries a timestamp. The service receiving the events can replay them in order and so reconstruct a materialized view of the state. This makes the event-based pattern ideal for EVI.
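
The replay idea can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular event store: the `Event` fields, the `materialize` function, and the order identifiers are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # assumed to be stamped by the producer
    key: str          # hypothetical entity identifier, e.g. an order id
    field: str
    value: object


def materialize(events):
    """Replay events in timestamp order and return the resulting state."""
    state = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        state.setdefault(event.key, {})[event.field] = event.value
    return state


# Events may arrive out of order; the timestamps restore the sequence.
events = [
    Event(3.0, "order-1", "status", "shipped"),
    Event(1.0, "order-1", "status", "created"),
    Event(2.0, "order-1", "status", "paid"),
    Event(2.5, "order-2", "status", "created"),
]
view = materialize(events)
```

Because the view is derived entirely from the time-stamped events, a receiving service can discard it and rebuild it at any point, which is what makes the timestamp the essential field.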

Any pattern that records time stamps is suitable. Therefore, an index of microservices should also attempt to classify whether a Service has a time-stamp, using the input of whether it is stateful as a key predictor. To start, “service discovery” is required. Enterprise Value Integration (EVI) is then able to leverage the inventory of services to track inputs, transformations, and outputs associated with each customer and each process they consume. This is vital to improve profitability, while also delivering more consistently on brand promises.
EVI is the only brand consulting firm in the world to recognize the importance of containers and serverless computing – and consistently re-engineer organizations to deliver more value.

Microsoft Azure is the second largest cloud business behind AWS, with approximately 13% of all combined Infrastructure-as-a-Service (IaaS) & Platform-as-a-Service (PaaS) market share.

SERVICE REGISTRATION.

Registries of service instances already exist. However, they typically reflect only the current state of active instances. The network location is registered at the point a service is spun up. A service is often described as ephemeral, because it is deleted from the service registry as soon as the instance is terminated.

Netflix Eureka is an example of a registry using a Representational State Transfer (REST) API. A service instance registers its network location with a POST request and refreshes its registration every 30 seconds with a PUT request. A client queries the registered service instances with an HTTP GET request. An instance registration may time out, or an HTTP DELETE request can remove it. This illustrates the fleeting nature of services, and the importance of being able to create an index of inactive services over time.
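
The register, refresh, query, and delete cycle can be illustrated with a small in-memory registry. This sketch does not call or reproduce Eureka's actual API; the class, its method names, and the 90-second lease are assumptions made for the example, and the comments only note which HTTP verb each method is analogous to.

```python
import time


class ServiceRegistry:
    """Illustrative in-memory registry with lease expiry (not Eureka's API)."""

    def __init__(self, lease_seconds=90.0, clock=time.monotonic):
        self.lease_seconds = lease_seconds  # assumed lease length
        self.clock = clock
        self._instances = {}  # (service, instance_id) -> (location, last_renewed)

    def register(self, service, instance_id, location):  # analogous to POST
        self._instances[(service, instance_id)] = (location, self.clock())

    def renew(self, service, instance_id):  # analogous to the periodic PUT
        key = (service, instance_id)
        if key not in self._instances:
            return False
        location, _ = self._instances[key]
        self._instances[key] = (location, self.clock())
        return True

    def deregister(self, service, instance_id):  # analogous to DELETE
        return self._instances.pop((service, instance_id), None) is not None

    def lookup(self, service):  # analogous to GET
        now = self.clock()
        expired = [key for key, (_, seen) in self._instances.items()
                   if now - seen > self.lease_seconds]
        for key in expired:  # registrations that timed out
            del self._instances[key]
        return sorted(location for (name, _), (location, _)
                      in self._instances.items() if name == service)


# A fake clock makes the time-out behavior easy to follow.
now = [0.0]
registry = ServiceRegistry(lease_seconds=90.0, clock=lambda: now[0])
registry.register("orders", "i-1", "10.0.0.5:8080")
registry.register("orders", "i-2", "10.0.0.6:8080")
now[0] = 60.0
registry.renew("orders", "i-1")      # i-2 never renews
now[0] = 120.0
active = registry.lookup("orders")   # only i-1 is still within its lease
```

Note that once an instance expires or is deleted, the registry retains no trace of it, which is the gap an index of inactive services is meant to fill.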

There are two main patterns for registration: self-registration and third-party registration. Eureka is an example of a service instance registering and deregistering itself with a registry. An example of a third-party service registry is the Registrator project, which is open source. Service instances deployed as containers are automatically registered and deregistered. It is designed specifically for Docker containers and supports ETCD and Consul.

With the third-party registration pattern, the service registrar monitors and records changes to the set of running instances. It does this in one of two ways: the registrar can subscribe to events or poll the deployment environment. Therefore, a master index of the names of all services is possible, but it requires continual monitoring to catalogue the name of each service, associate each with its state, and see the pattern of network locations used over time.
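
A polling registrar that keeps history, rather than only the current state, might look like the following sketch. The `HistoricalIndex` class and the shape of a snapshot (a set of service name and location pairs) are assumptions for illustration; a real registrar would obtain each snapshot from the deployment environment.

```python
class HistoricalIndex:
    """Records registrations and deregistrations seen across polls."""

    def __init__(self):
        self._running = set()  # (service, location) pairs from the last poll
        self.history = []      # (timestamp, event, service, location)

    def observe(self, timestamp, snapshot):
        """Compare a new poll result with the previous one and log the changes."""
        snapshot = set(snapshot)
        for service, location in sorted(snapshot - self._running):
            self.history.append((timestamp, "registered", service, location))
        for service, location in sorted(self._running - snapshot):
            self.history.append((timestamp, "deregistered", service, location))
        self._running = snapshot

    def locations_ever_used(self, service):
        """Every network location a service has occupied, active or not."""
        return sorted({location for _, event, name, location in self.history
                       if name == service and event == "registered"})


index = HistoricalIndex()
index.observe(0, {("orders", "10.0.0.5:8080")})
index.observe(30, {("orders", "10.0.0.5:8080"), ("billing", "10.0.0.9:9000")})
index.observe(60, {("orders", "10.0.0.7:8080"), ("billing", "10.0.0.9:9000")})
used = index.locations_ever_used("orders")
```

Polling can miss an instance that starts and stops between two polls, which is one reason a registrar may prefer to subscribe to events when the environment offers them.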

Amazon Web Services (AWS) was first to market and is the number one cloud services platform, offering compute power, database storage, content delivery, and additional functionality to help businesses scale and grow. Millions of customers leverage AWS cloud products and solutions to build sophisticated applications with increased flexibility, scalability and reliability.

SERVICE DISCOVERY.

The common use case of service registration revolves around real-time discovery of the locations of services on the network. It is ephemeral. Some service instances are relatively static and others are dynamic. Network locations of instances are more static in an application running on physical hardware. A system can use a configuration file to identify the network locations in static environments. These files are updated periodically, and can be easily incorporated into the index of services. Nginx points out that Service instances have “dynamically assigned network locations. Moreover, the set of service instances changes dynamically because of autoscaling, failures, and upgrades. Consequently, your client code needs to use a more elaborate service discovery mechanism.”

Data can be transmitted in a service-based environment in the form of a datagram or packet. In the most common usage, service discovery enables awareness of the instances of each process in the cluster, each of which listens on a specific User Datagram Protocol (UDP) port as well as a Transmission Control Protocol (TCP) port. UDP is commonly used by applications with real-time requirements. It favors on-time delivery of most packets over reliability, so some packets may be dropped along the way.

Jeff Lindsay is a programmer and blogger. He notes that service discovery “is increasingly gaining mindshare in mainstream system architecture. Traditionally associated with zero-configuration networking, its more modern use can be summarized as facilitating connections to dynamic, sometimes ephemeral services.” Lindsay explains that increasingly dynamic compute environments are becoming more common. Microservices and containers are important advances that also complicate service discovery. New solutions are being developed to solve new problems.

Google Cloud Console helps you deploy, scale and diagnose production issues in a simple web based interface. Search to quickly find resources and connect to instances via SSH in the browser. Handle devops workflows on the go with powerful native iOS and Android applications. Master the most complex development tasks with Google Cloud Shell, your admin machine in the cloud.

LOG FILES.

EVI takes discovery in a different direction by using past records to maintain a running inventory. Log files contain a time stamp and data about the internet protocol (IP) address, the referring URL, the user agent, the access request, and the username, along with the number of bytes transferred and the result status. EVI aggregates each log file, which is retained by the servers.
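
The fields listed above correspond closely to the widely used combined access log format, so a single log line can be broken into those fields with a regular expression. The sketch below assumes that format; the sample line, the names, and the addresses in it are invented for illustration and do not come from any real log.

```python
import re
from datetime import datetime

# Assumed layout: ip ident user [time] "request" status bytes "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record


# Hypothetical sample line.
sample = ('203.0.113.7 - alice [10/Oct/2023:13:55:36 +0000] '
          '"GET /orders/42 HTTP/1.1" 200 2326 '
          '"https://example.com/cart" "Mozilla/5.0"')
record = parse_line(sample)
```

Once each line is a structured record, aggregating across servers becomes a matter of merging records by time stamp and grouping them by the requested service.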

According to VMware, a virtual machine (VM) is a software computer that, like a physical machine, runs an operating system and applications. A virtual machine uses the physical resources of the physical machine on which it runs, which is called the host system. Virtual machines have virtual devices that provide the same functionality as physical hardware, but with the additional benefits of portability, manageability, and security. A VM has an operating system and virtual resources that you manage in much the same way that you manage a physical computer.

TOOLS & TECHNOLOGY IN SERVICE DISCOVERY.

On-premises legacy systems cannot easily scale and do not have the flexibility to enable rapid business model innovation. Virtual machines (VMs) opened the door for greater flexibility, not confining a company to a finite set of servers. Containers take that to an entirely new level: because they consume fewer system resources than VMs, your organization can pack a lot more applications onto a single physical server. Containers are also faster because they don’t contain a full copy of the operating system (OS) when they spin up. VMs also use more random access memory (RAM) and central processing unit (CPU) cycles. As administrators have realized that containers enable their teams to create a portable, consistent operating environment for development, testing, and deployment, Docker and Kubernetes have become increasingly popular.

Docker is one of the leading companies and platforms for developers and sysadmins to develop, ship, and run applications in containers. Docker lets you quickly assemble applications from components and eliminates the friction that can come when shipping code.

Kubernetes is an open-source system, originally developed by Google, for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery.

Docker & Kubernetes. Docker has free open source and paid enterprise versions and is considered a simpler and more flexible container platform than Kubernetes. Although Kubernetes is also open source, it comes from Google and is considered more intricate. Some say it is overly complicated. The simplicity of Docker is illustrated through service discovery. As long as the container name is used as the hostname, a container can always discover other containers on the same stack. Docker Cloud uses directional links recorded in environment variables to provide basic service discovery functionality. The complexity of Kubernetes is illustrated through service registration and discovery. Kubernetes is predicated on the idea that a Service is a REST object. This means that a Service definition can be POSTed to the apiserver to create a new instance. Kubernetes offers an Endpoints API, which is updated whenever the set of Pods in a Service changes. Google explains, “for non-native applications, Kubernetes offers a virtual-IP-based bridge to Services which redirects to the backend Pods.” A Kubernetes pod is a group of containers that are deployed together on the same host. Kubernetes supports Domain Name System (DNS) discovery as well as environment variables. Google strongly recommends the DNS approach.
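
The two Kubernetes discovery modes can be compared in a short sketch. Kubernetes exposes each Service to Pods through environment variables named after the Service (upper-cased, with dashes turned into underscores, plus `_SERVICE_HOST` and `_SERVICE_PORT`), and through a DNS name built from the Service name and namespace. The helper functions below are hypothetical, the `redis-primary` Service and its address are made up, and `cluster.local` is assumed as the cluster domain.

```python
import os


def service_env_names(service_name):
    """Environment variable names Kubernetes derives from a Service name."""
    prefix = service_name.upper().replace("-", "_")
    return f"{prefix}_SERVICE_HOST", f"{prefix}_SERVICE_PORT"


def discover_from_env(service_name, environ=os.environ):
    """Look up a Service's host and port from environment variables."""
    host_var, port_var = service_env_names(service_name)
    host, port = environ.get(host_var), environ.get(port_var)
    if host is None or port is None:
        return None  # the Service was not visible when this process started
    return host, int(port)


def cluster_dns_name(service_name, namespace="default",
                     cluster_domain="cluster.local"):
    """DNS name of a Service; the cluster domain is an assumed default."""
    return f"{service_name}.{namespace}.svc.{cluster_domain}"


# Simulated Pod environment for a hypothetical "redis-primary" Service.
fake_env = {
    "REDIS_PRIMARY_SERVICE_HOST": "10.0.0.11",
    "REDIS_PRIMARY_SERVICE_PORT": "6379",
}
endpoint = discover_from_env("redis-primary", fake_env)
dns_name = cluster_dns_name("redis-primary")
```

One reason the DNS approach is usually preferred is ordering: environment variables are fixed when a container starts, so a Service created afterwards is invisible to it, while a DNS name can be resolved at any time.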

Nerve and Synapse are available on GitHub. They are the components of SmartStack, which is an automated service discovery and registration framework.

SmartStack. Airbnb needed a solution that addressed its service discovery needs. The hospitality broker between hosts and guests did not think that DNS-based discovery was sufficient for its needs. DNS suffers from propagation delays. Exact propagation delays are typically non-deterministic because of the various layers of caching in the DNS infrastructure. DNS cannot be used to push state, so consumers have to poll for all changes. So, Airbnb built its own framework and then published it on GitHub as open source. SmartStack automates service discovery and registration. It handles creation, deletion, failure, and maintenance work of the machines running code. It has two primary components called Nerve and Synapse. Nerve tracks services in a centralized automation center and uses ZooKeeper as its key-value store. Synapse discovers remote services.

Mesos-DNS provides service discovery within DC/OS clusters. It is fully integrated into DC/OS and allows applications and services on the cluster to find each other through DNS, similar to how services discover each other throughout the Internet.

Mesos. Mesos is part of the Apache project. It provides scheduling and resource management in cloud environments and data centers. The cloud promises scalability and Mesos was designed to scale to tens of thousands of nodes. One drawback of Mesos-DNS is that it is stateless, which interferes with integrations in business processes. Mesos was built at a different level of abstraction than the Linux kernel, but uses the same principles. Mesos is well-suited to provide service discovery functionality when used with ZooKeeper, which is also part of the Apache project.

ZooKeeper is a distributed, open-source coordination service for distributed applications, providing configuration and synchronization services for cluster computing. It uses a data model styled after the familiar directory tree structure of file systems, and the Hadoop YARN ResourceManager (RM) relies on it for high availability. HBase, Storm and other software use ZooKeeper for coordinating the cluster.

Zookeeper. When used under the proper conditions, ZooKeeper maintains configuration information, discovers services and presents them in a simple interface to a centralized coordination service. For the purpose of EVI’s Service Discovery, we do not need to deploy most of what Zookeeper does in the deployment phase. The Zookeeper blog states, “each time they [applications] are implemented there is a lot of work that goes into fixing the bugs and race conditions that are inevitable. Because of the difficulty of implementing these kinds of services, applications initially usually skimp on them, which make them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.” Zookeeper includes key/value stores and libraries.

ETCD is a distributed key/value store that provides a way to store data across a cluster of machines. CoreOS claims it “gracefully handles leader elections during network partitions and will tolerate machine failure, including the leader.”

ETCD, Consul and More. Other service discovery tools such as ETCD, Consul, Spartan, Marathon-lb, and Minuteman include additional functionality. ETCD also has a representational state transfer (REST) application programming interface (API) and builds consensus using RAFT. It runs as part of a cluster of nodes that maintain a replicated state machine, while sending and receiving messages in the Protocol Buffer format. RAFT is a consensus algorithm for managing a replicated log and represents an alternative to Paxos. Ongaro and Ousterhout note that RAFT “separates the key elements of consensus, such as leader election, log replication, and safety, and it enforces a stronger degree of coherency to reduce the number of states that must be considered.” Diego Ongaro is lead software engineer, compute infrastructure at Salesforce and John Ousterhout is professor of computer science at Stanford University.
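
RAFT decides both leader elections and log commits by majority, which is what lets a cluster keep working when some machines fail. The arithmetic is small enough to sketch; the function names below are illustrative and are not part of any ETCD or RAFT library.

```python
def quorum(cluster_size):
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size):
    """How many nodes can fail while a majority remains reachable."""
    return cluster_size - quorum(cluster_size)


def wins_election(votes_received, cluster_size):
    """A candidate becomes leader once a majority has voted for it."""
    return votes_received >= quorum(cluster_size)


# Cluster size -> (quorum, failures tolerated)
sizes = {n: (quorum(n), tolerated_failures(n)) for n in (3, 4, 5)}
```

A four-node cluster tolerates no more failures than a three-node one, which is why clusters of this kind are typically run with an odd number of members.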

Consul has a REST API as well as a Domain Name System (DNS) interface. Alone, DNS service discovery is basic, and maintains A records. Address or “A” records are one of the primary records used in DNS servers; an A record maps a domain name to the internet protocol version four (IPv4) address of the computer hosting the domain. For example, if you enter domain.com, the A record might point to 202.34.77.253. Adoption of IP version six (IPv6) is increasing because it has protocol enhancements for security and has a much larger address space. An IPv4 address has four groups of numbers, while an IPv6 address has eight. A “quad A” (AAAA) record is the counterpart of the A record for IPv6 addresses.
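
The difference between the two record types follows directly from the address family, which Python's standard `ipaddress` module can show. The helper names are hypothetical, and `2001:db8::1` is an address from the range reserved for documentation.

```python
import ipaddress


def record_type(address):
    """DNS record type that would carry this address."""
    return "A" if ipaddress.ip_address(address).version == 4 else "AAAA"


def groups(address):
    """Split an address into its written groups (4 for IPv4, 8 for IPv6)."""
    ip = ipaddress.ip_address(address)
    separator = "." if ip.version == 4 else ":"
    return ip.exploded.split(separator)


v4 = "202.34.77.253"
v6 = "2001:db8::1"
summary = {
    v4: (record_type(v4), len(groups(v4)), ipaddress.ip_address(v4).max_prefixlen),
    v6: (record_type(v6), len(groups(v6)), ipaddress.ip_address(v6).max_prefixlen),
}
```

The last value in each tuple is the address length in bits: 32 for IPv4 and 128 for IPv6, which is the source of the much larger address space.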

In a dynamic, ever-changing environment, Zookeeper propagates state updates with ZAB. It is a broadcast protocol. SkyDNS is similar, in that it is a distributed service to announce services, and is built on ETCD. It translates ETCD keys and values to the DNS.

With respect to server-side patterns, Chris Richardson notes, “When making a request to a service, the client makes a request via a router (a.k.a load balancer) that runs at a well-known location. The router queries a service registry, which might be built into the router, and forwards the request to an available service instance.”

DISCOVERY PATTERNS.

One problem is that many organizations have systems that also include legacy applications, but there are patterns that can be used to ameliorate this challenge. The administrator needs to determine which pattern fits their need based on their circumstances. There are four common discovery patterns: self-registration, third-party, client-side, and server-side. For microservices, there is also an anti-pattern with server-side discovery.

ZooKeeper supports the client-side discovery pattern with legacy applications. Network locations are discovered by the client. The client launches or triggers the connection in TCP, and the server listens and accepts the connection. Commonly, servers are software programs operating remotely that can be accessed from a user’s device, such as a mobile device or workstation. A client is a computer application that operates on a user’s device and connects to a server. On the internet, a web server can be accessed from a web browser, which is a client. With ZooKeeper’s client-side discovery pattern, there is an integrated library, and the client performs load balancing. ZooKeeper is written in Java and provides Java and C language bindings. In general, client-side discovery provides a service registry, which is independent of the API Gateway and the Microservice it supports or exposes. EVI does not need to draw on the load balancing functionality. EVI must offer value by extending the library based on a client-side pattern, when there are monolithic legacy applications.

In contrast, the server-side discovery pattern runs a query to a service by way of the load balancer. The load balancer is aware of the service; the client is not. The load balancer is dynamic and often uses DNS service registries in real time. Server-side discovery can also substitute an API Gateway for the load balancer. The gateway connects to the Service Registry as well as to a Microservice. Both are independent of one another, but share access to the API Gateway.
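
A minimal sketch of the client-side pattern follows: the client asks the registry for the instances of a service and balances the load itself. The class names, the round-robin policy, and the addresses are assumptions for illustration and do not represent the ZooKeeper client library.

```python
class StaticRegistry:
    """Stand-in for a service registry; a real one would change over time."""

    def __init__(self, table):
        self._table = table  # service name -> list of network locations

    def lookup(self, service):
        return list(self._table.get(service, []))


class DiscoveringClient:
    """Client-side discovery: query the registry, then pick an instance."""

    def __init__(self, registry):
        self.registry = registry
        self._counters = {}  # per-service request count for round-robin

    def choose(self, service):
        instances = self.registry.lookup(service)
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        count = self._counters.get(service, 0)
        self._counters[service] = count + 1
        return instances[count % len(instances)]


registry = StaticRegistry({"orders": ["10.0.0.5:8080", "10.0.0.6:8080"]})
client = DiscoveringClient(registry)
picks = [client.choose("orders") for _ in range(3)]
```

In the server-side pattern the same `choose` logic would live in the load balancer or API Gateway at a well-known address, so the client would need to know only that one location.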

Workday is one of the most innovative vendors of financial applications with a software-as-a-service (SaaS) business model. Workday is deploying software-defined compute, networking and storage within a datacenter architecture of Amazon-like availability zones. It uses Docker to deploy and scale images on instances across distributed systems, but started with OpenStack to make the transition to MSA.

OpenStack software controls large pools of compute, storage, and networking resources throughout a datacenter, managed through a dashboard or via the OpenStack API. OpenStack works with popular enterprise and open source technologies making it ideal for heterogeneous infrastructure.

OpenStack. Endpoints, backends, secure infrastructure and finding ways to reduce overhead are critical in the migration to MSA. OpenStack is a cloud operating system (OS) that manages compute, storage, and networking resources throughout a datacenter. It provides administrators with control, yet also distributes responsibility to users to provision resources through a web-based dashboard. OpenStack is both an infrastructure and platform layer and utilizes Magnum and Murano. Magnum is a container infrastructure management service and Murano is an application catalog. Yet, OpenStack does not have a fully formed service discovery tool, although components such as Keystone, Nova and Neutron all offer ways to manage and catalog components and services, especially with containers. OpenStack also permits Mesos, Kubernetes, and Docker to run on top. There are third parties attempting to close the gap with OpenStack. NGINX Plus works with OpenStack Heat to offer a simpler DNS‑based service discovery to dynamically discover and update the IP addresses of backend servers via the DNS protocol.