Tuesday, October 20, 2020

What is CI/CD? Continuous Integration and Continuous Delivery explained

The CI/CD pipeline is one of the best practices for DevOps teams to implement, for delivering code changes more frequently and reliably.

Continuous integration (CI) and continuous delivery (CD) embody a culture, set of operating principles, and collection of practices that enable application development teams to deliver code changes more frequently and reliably. The implementation is also known as the CI/CD pipeline.

CI/CD is one of the best practices for DevOps teams to implement. It is also an agile methodology best practice, as it enables software development teams to focus on meeting business requirements, code quality, and security because deployment steps are automated.

CI / CD defined

Continuous Integration is a coding philosophy and set of practices that drive development teams to implement small changes and check in code to version control repositories frequently. Because most modern applications require developing code in different platforms and tools, the team needs a mechanism to integrate and validate its changes.

The technical goal of CI is to establish a consistent and automated way to build, package, and test applications. With consistency in the integration process in place, teams are more likely to commit code changes more frequently, which leads to better collaboration and software quality.

Continuous delivery picks up where continuous integration ends. CD automates the delivery of applications to selected infrastructure environments. Most teams work with multiple environments other than the production, such as development and testing environments, and CD ensures there is an automated way to push code changes to them.

CI/CD tools help store the environment-specific parameters that must be packaged with each delivery. CI/CD automation then performs any necessary service calls to web servers, databases, and other services that may need to be restarted or follow other procedures when applications are deployed.

Continuous integration and continuous delivery require Continuous Testing because the objective is to deliver quality applications and code to users. Continuous testing is often implemented as a set of automated regression, performance, and other tests that are executed in the CI/CD pipeline.

For more info please go through the url below

https://dzone.com/articles/secure-and-scalable-cicd-pipeline-with-aws

https://eduinpro.com/blog/top-devops-tools-in-the-digital-market/

A mature CI/CD devops practice has the option of implementing continuous deployment where application changes run through the CI/CD pipeline and passing builds are deployed directly to production environments. Teams practicing continuous delivery elect to deploy to production on a daily or even hourly schedule, though continuous delivery isn’t always optimal for every business application.

Wednesday, September 16, 2020

Kubernetes101: Nodes, Clusters, Containers and Pods

Kubernetes is quickly becoming the new standard for deploying and managing software in the cloud.There are many different pieces that make up the system, and it can be hard to tell which ones are relevant for our use case. This post will provide a simplified view of Kubernetes, but it will attempt to give a high-level overview of the most important components and how they fit together.

First, lets look at how hardware is represented

Hardware

Nodes

A node is the smallest unit of computing hardware in Kubernetes. It is a representation of a single machine in our cluster. In most production systems, a node will likely be either a physical machine in a data center, or virtual machine hosted on a cloud provider.

Thinking of a machine as a “node” allows us to insert a layer of abstraction. Now, instead of worrying about the unique characteristics of any individual machine, we can instead simply view each machine as a set of CPU and RAM resources that can be utilized. In this way, any machine can substitute any other machine in a Kubernetes cluster.

The Cluster

Although working with individual nodes can be useful, it’s not the Kubernetes way. In general, we should think about the cluster as a whole, instead of worrying about the state of individual nodes.

In Kubernetes, nodes pool together their resources to form a more powerful machine. When we deploy programs onto the cluster, it intelligently handles distributing work to the individual nodes for us. If any nodes are added or removed, the cluster will shift around work as necessary. It shouldn’t matter to the program, or the programmer, which individual machines are actually running the code.

Persistent Volumes

Because programs running on our cluster aren’t guaranteed to run on a specific node, data can’t be saved to any arbitrary place in the file system. If a program tries to save data to a file for later, but is then relocated onto a new node, the file will no longer be where the program expects it to be. For this reason, the traditional local storage associated to each node is treated as a temporary cache to hold programs, but any data saved locally can not be expected to persist.

To store data permanently, Kubernetes uses Persistent Volumes. While the CPU and RAM resources of all nodes are effectively pooled and managed by the cluster, persistent file storage is not. Instead, local or cloud drives can be attached to the cluster as a Persistent Volume. This can be thought of as plugging an external hard drive in to the cluster. Persistent Volumes provide a file system that can be mounted to the cluster, without being associated with any particular node.

Software

Containers

Programs running on Kubernetes are packaged as Linux Containers. Containers are a widely accepted standard, so there are already many Pre-built Images that can be deployed on Kubernetes.

Containerization allows you to create self-contained Linux execution environments. Any program and all its dependencies can be bundled up into a single file and then shared on the internet. Anyone can download the container and deploy it on their infrastructure with very little setup required. Creating a container can be done programmatically, allowing powerful CI and CD pipelines to be formed.

Multiple programs can be added into a single container, but you should limit yourself to one process per container if at all possible. It’s better to have many small containers than one large one. If each container has a tight focus, updates are easier to deploy and issues are easier to diagnose.

Pods

Unlike other systems you may have used in the past, Kubernetes doesn’t run containers directly; instead it wraps one or more containers into a higher-level structure called a pod. Any containers in the same pod will share the same resources and local network. Containers can easily communicate with other containers in the same pod as though they were on the same machine while maintaining a degree of isolation from others.

Pods are used as the unit of replication in Kubernetes. If your application becomes too popular and a single pod instance can’t carry the load, Kubernetes can be configured to deploy new replicas of your pod to the cluster as necessary. Even when not under heavy load, it is standard to have multiple copies of a pod running at any time in a production system to allow load balancing and failure resistance.

Pods can hold multiple containers, but you should limit yourself when possible. Because pods are scaled up and down as a unit, all containers in a pod must scale together, regardless of their individual needs. This leads to wasted resources and an expensive bill. To resolve this, pods should remain as small as possible, typically holding only a main process and its tightly-coupled helper containers (these helper containers are typically referred to as “side-cars”).

Deployments

Although pods are the basic unit of computation in Kubernetes, they are not typically directly launched on a cluster. Instead, pods are usually managed by one more layer of abstraction: the deployment.

A deployment’s primary purpose is to declare how many replicas of a pod should be running at a time. When a deployment is added to the cluster, it will automatically spin up the requested number of pods, and then monitor them. If a pod dies, the deployment will automatically re-create it.

Using a deployment, you don’t have to deal with pods manually. You can just declare the desired state of the system, and it will be managed for you automatically.

Ingress

Using the concepts described above, you can create a cluster of nodes, and launch deployments of pods onto the cluster. There is one last problem to solve, however: allowing external traffic to your application.

By default, Kubernetes provides isolation between pods and the outside world. If you want to communicate with a service running in a pod, you have to open up a channel for communication. This is referred to as ingress.

There are multiple ways to add ingress to your cluster. The most common ways are by adding either an Ingress controller, or a Load Balancer. The exact trade offs between these two options are out of scope for this post, but you must be aware that ingress is something you need to handle before you can experiment with Kubernetes.

Saturday, August 22, 2020

Certificate Signing request (CSR), Private Key and Certificate Management

A Certificate Signing Request (CSR) is one of the first steps towards getting your own SSL Certificate. Generated on the same server you plan to install the certificate on, the CSR contains information (e.g. common name, organization, country) the Certificate Authority (CA) will use to create your certificate. In other way, a CSR is an encoded file that provides you with a standardized way to send DigiCert your public key as well as some information that identifies your company and domain name.

Private Keys and Public Keys terms are used in encryption and decryption. These keys are used to encrypt/decrypt sensitive information.

Private Key

The private key is used to both encrypt and decrypt the data. This key is shared between the sender and receiver of the encrypted sensitive information. The private key is also called symmetric being common for both parties. Private key cryptography is faster than public-key cryptography mechanism.

Public Key

The public key is used to encrypt and a private key is used decrypt the data. The private key is shared between the sender and receiver of the encrypted sensitive information. The public key is also called asymmetric cryptography.

The following are some of the important differences between Private Key and Public Key.

Generate a Private Key and a CSR together

[root@ ~]# openssl req -new -newkey rsa:2048 -nodes -keyout server5308.key -out server5308.csr

Generating a 2048 bit RSA private key

..................................................................................................+++

writing new private key to 'dewaserv5308.key'

You are about to be asked to enter information that will be incorporated

into your certificate request.

What you are about to enter is what is called a Distinguished Name or a DN.

There are quite a few fields but you can leave some blank

For some fields there will be a default value,

If you enter '.', the field will be left blank.

Country Name (2 letter code) [XX]:AE

State or Province Name (full name) []:DXB

Locality Name (eg, city) [Default City]:Dubai

Organization Name (eg, company) [Default Company Ltd]:XYZ

Organizational Unit Name (eg, section) []:AMI

Common Name (eg, your name or your server's hostname) []:xyz.smartgrid.local

Email Address []:xyz@gmail.com

Please enter the following 'extra' attributes

to be sent with your certificate request

A challenge password []:xdsarrdddd

An optional company name []:XYZ

Generate a CSR from an Existing Private Key

if you already have a private key of the server

[root@ ~]# openssl req -key server5308.key -new -out server5308.csr

Generate a CSR from an Existing Certificate and Private Key

if you already have a private key and Certificate of the server

[root@ ~]# openssl x509 -in server5308.crt -signkey server5308.key -x509toreq -out server5308.csr

Generate a Self-Signed Certificate and private key

[root@ ~]#openssl req -newkey rsa:2048 -nodes -keyout server5308.key -x509 -days 365 -out server5308.crt

Generate a Self-Signed Certificate from an Existing Private Key

if you already have a private key of the server

[root@ ~]# openssl req -key server5308.key -new -x509 -days 365 -out server5308.crt

Generate a Self-Signed Certificate from an Existing Private Key and CSR

if you already have a private key and CSR of the server

[root@ ~]# openssl x509 -signkey server5308.key -in server5308.csr -req -days 365 -out server5308.crt

View CSR entries

[root@ ~]#openssl req -text -noout -verify -in server5308.csr

View private key entries

[root@ ~]#openssl rsa -check -in server5308.key

View Certificate Entries

[root@ ~]#openssl x509 -text -noout -in server5308.crt

Verify a Certificate was Signed by a CA

[root@ ~]#openssl verify -verbose -CAFile ca.crt server5308.crt

[root@ ~]#openssl verify -CAfile cacert.pem xyz.smartgrid.local.pem

Verify a Private Key Matches a Certificate and CSR

[root@ ~]#openssl rsa -noout -modulus -in server5308.key | openssl md5

[root@ ~]#openssl x509 -noout -modulus -in server5308.crt | openssl md5

[root@ ~]#openssl req -noout -modulus -in server5308.csr | openssl md5

Check the issuer and CN(common name)

[root@ ~]#cd /etc/httpd/ssl

[root@ ~]#openssl x509 -in xyz.smartgrid.local.pem -noout -subject

[root@ ~]#openssl x509 -inxyz.smartgrid.local.pem -noout -issuer -subject

Creating pkcs file

[root@ ~]#openssl pkcs12 -export -in xyz.smartgrid.local.pem -inkey xyz.smartgrid.local.key -out xyz.smartgrid.local.p12

[root@ ~]#openssl pkcs12 -export -in xyz.smartgrid.local.pem -inkey xyz.smartgrid.local.key -out xyz.smartgrid.local.p12 -chain -CAfile cacert.pem

Thursday, August 20, 2020

KAFKA

What is Kafka?

Apache Kafka is a distributed event streaming platform capable of handling trillions of events a day. This was initially conceived as a messaging Queue because Kafka is based on an abstraction of a distributed committed log. Since being created and open sourced by LinkedIn in 2011, Kafka has quickly evolved from messaging queue to a full fledged event streaming platform.

To be precise, Kafka is used for publishing and subscribing events. The basic concepts of Kafka are comparable to traditional messaging systems. Producers publish messages onto topics.Consumers subscribe to topics and receive messages.

Excursus

• A topic is a category to which messages are published. Topics consist of at least one partition.

• Partitions contain an ordered sequence of messages. The messages are stored in regular files.

• The order of messages within a given partition is the same across the whole cluster (“totally ordered”). All producers and all consumers see messages of one partition in the same order.

• There is no order however between messages on different partitions. A total order of messages in a topic can be achieved by only having one partition for this topic.

• With multiple partitions, each partition may be consumed by a different consumer of the same consumer group. Kafka guarantees that each partition will be assigned to exactly one consumer of a group. A partition may of course be assigned to multiple consumers, each part of a different consumer group. Producers choose onto which partition messages are published to. This can be round-robin or a domain based partition.

Simple Testing

- 1. Download Kafka from below ftp site

http://ftp.nluug.nl/internet/apache/kafka/2.2.0/kafka_2.12-2.2.0.tgz

- 2. Enter the /opt directory, and extract the archive:

cd /opt

tar -xvf kafka_2.12-2.2.0.tgz

3. Create a symlink called /opt/kafka that points to the now created /opt/ kafka_2.12-2.2.0 directory to make our lives easier.

ln -s /opt/kafka_2.12-2.2.0 /opt/kafka

4. Create a non-privileged user that will run both zookeeper and kafka service.

useradd kafka

5. Set the new user as owner of the whole directory we extracted, recursively

chown -R kafka:kafka /opt/kafka*

6.Create the unit file /etc/systemd/system/zookeeper.service with the following content:

[Unit]

Description=zookeeper

After=syslog.target network.target

[Service]

Type=simple

User=kafka

Group=kafka

ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties

ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh

[Install]

WantedBy=multi-user.target

Note that we do not need to write the version number three times because of the symlink we created. The same applies to the next unit file for Kafka, /etc/systemd/system/kafka.service, that contains the following lines of configuration:

[Unit]

Description=Apache Kafka

Requires=zookeeper.service

After=zookeeper.service

[Service]

Type=simple

User=kafka

Group=kafka

ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties

ExecStop=/opt/kafka/bin/kafka-server-stop.sh

[Install]

WantedBy=multi-user.target

- 7. Need to reload systemd to get it read the new unit files:

systemctl daemon-reload

8. We can start our new services (in this order):

systemctl start zookeeper

systemctl start kafka

If all goes well, systemd should report running state on both service's status, similar to the outputs below:

# systemctl status zookeeper.service

zookeeper.service - zookeeper

Loaded: loaded (/etc/systemd/system/zookeeper.service; disabled; vendor preset: disabled)

Active: active (running) since Thu 2019-01-10 20:44:37 CET; 6s ago

Main PID: 11628 (java)

Tasks: 23 (limit: 12544)

Memory: 57.0M

CGroup: /system.slice/zookeeper.service

11628 java -Xmx512M -Xms512M -server [...]

# systemctl status kafka.service

kafka.service - Apache Kafka

Loaded: loaded (/etc/systemd/system/kafka.service; disabled; vendor preset: disabled)

Active: active (running) since Thu 2019-01-10 20:45:11 CET; 11s ago

Main PID: 11949 (java)

Tasks: 64 (limit: 12544)

Memory: 322.2M

CGroup: /system.slice/kafka.service

11949 java -Xmx1G -Xms1G -server [...]

- 9. Optionally we can enable automatic start on boot for both services

# systemctl enable zookeeper.service

# systemctl enable kafka.service

To test functionality, we'll connect to Kafka with one producer and one consumer client. The messages provided by the producer should appear on the console of the consumer. But before this we need a medium these two exchange messages on. We create a new channel of data called topic in Kafka's terms, where the provider will publish, and where the consumer will subscribe to. We'll call the topic FirstKafkaTopic. We'll use the kafka user to create the topic:

#su - kafka

$ /opt/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic FirstKafkaTopic

10. Start a consumer client from the command line that will subscribe to the (at this point empty) topic created in the previous step:

/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic FirstKafkaTopic --from-beginning

We leave the console and the client running in it open. This console is where we will receive the message we publish with the producer client.

11. On another terminal, we start a producer client, and publish some messages to the topic we created. We can query Kafka for available topics:

$ /opt/kafka/bin/kafka-topics.sh --list --zookeeper localhost:2181 FirstKafkaTopic

12. And connect to the one the consumer is subscribed, then send a message:

$ /opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic FirstKafkaTopic

> new message publish ed by producer from console #2

13. At the consumer terminal, the message should appear shortly:

$ /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic FirstKafkaTopic --from-beginning

new message published by producer from console #2

If the message appears, our test is successful, and our Kafka installation is working as intended. Many clients could provide and consume one or more topic records the same way, even with a single node setup we created in this tutorial.