Integrating All the Concepts Using Python

Akshay Gupta
3 min read · Nov 9, 2020

Main-Script

The main script lets you choose which technology you want to work with (a minimal sketch of this menu follows the list):

  1. Configure Redhat Linux OS
  2. Configure Docker containers
  3. Configure Hadoop and Hadoop cluster
  4. Configure AWS
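
Here is a minimal sketch of such a menu-driven main script. It assumes the per-technology menus live in separate scripts; the file names linux.py, docker.py, hadoop.py, and aws.py are illustrative placeholders, not the actual script names.

```python
# A minimal sketch of a menu-driven main script. The sub-script names below
# are illustrative placeholders for the per-technology menus.
import os
import sys

SCRIPTS = {
    "1": "linux.py",    # Red Hat Linux tasks
    "2": "docker.py",   # Docker tasks
    "3": "hadoop.py",   # Hadoop cluster tasks
    "4": "aws.py",      # AWS tasks
}

while True:
    print("\n1. Configure Redhat Linux OS"
          "\n2. Configure Docker containers"
          "\n3. Configure Hadoop and Hadoop cluster"
          "\n4. Configure AWS"
          "\n5. Exit")
    choice = input("Enter your choice: ").strip()
    if choice == "5":
        sys.exit(0)
    elif choice in SCRIPTS:
        # Hand off to the chosen sub-menu script
        os.system(f"python3 {SCRIPTS[choice]}")
    else:
        print("Invalid choice, please try again.")
```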

Red Hat Linux OS

Red Hat® Enterprise Linux® is the world’s leading enterprise Linux platform. It’s an open-source operating system (OS). It’s the foundation from which you can scale existing apps — and roll out emerging technologies — across bare-metal, virtual, container and all types of cloud environments.

Through the Linux script you can view the date and calendar, create a directory, see memory details, see hard disk details, set up a web server, check the network connection, switch to the root user, install software, start a service or daemon, reboot the machine, and shut it down, just by entering the corresponding menu number. A minimal sketch of this sub-menu appears below.
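
The sketch below shows how such a sub-menu can be wired together, assuming the script runs on Red Hat with root privileges; the directory name and the exact commands are illustrative, matching the tasks listed above.

```python
# A minimal sketch of the Linux sub-menu. Assumes a Red Hat system and root
# privileges; the directory name and commands are illustrative.
import os

TASKS = {
    "1": "date; cal",                 # date & calendar
    "2": "mkdir mydir",               # create a directory (illustrative name)
    "3": "free -m",                   # memory details
    "4": "fdisk -l",                  # hard disk details
    "5": "yum install httpd -y && systemctl start httpd",  # set up a web server
    "6": "ping -c 4 8.8.8.8",         # check the connection
    "7": "reboot",                    # reboot the machine
    "8": "shutdown now",              # shut down the machine
}

choice = input("Enter the task number: ").strip()
if choice in TASKS:
    os.system(TASKS[choice])
else:
    print("Unknown option")
```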

Linux can serve as the basis for nearly any type of IT initiative, including containers, cloud-native applications, and security.

Docker Container

A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.

Container images become containers at runtime and in the case of Docker containers — images become containers when they run on Docker Engine.

With the help of Python and Docker integration, you can check the status of Docker, view Docker info, list the available images, list the launched containers, and launch a new container by entering the corresponding menu number. A minimal sketch follows.
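
The sketch below shows one way to build this sub-menu, assuming Docker is already installed and the script has sufficient privileges; the prompts and the example image name are illustrative.

```python
# A minimal sketch of the Docker sub-menu. Assumes Docker is installed and the
# script is run with sufficient privileges; the example image is illustrative.
import os

print("1. Docker status\n2. Docker info\n3. List images\n"
      "4. List containers\n5. Launch a container")
choice = input("Enter your choice: ").strip()

if choice == "1":
    os.system("systemctl status docker")
elif choice == "2":
    os.system("docker info")
elif choice == "3":
    os.system("docker images")
elif choice == "4":
    os.system("docker ps -a")
elif choice == "5":
    image = input("Image to launch (e.g. centos:latest): ").strip()
    os.system(f"docker run -it {image}")
else:
    print("Unknown option")
```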

Hadoop

A Hadoop cluster is a special type of computational cluster designed specifically for storing and analyzing huge amounts of unstructured data in a distributed computing environment.

Typically one machine in the cluster is designated as the NameNode and another machine as the ResourceManager, exclusively. The rest of the machines in the cluster act as both DataNode and NodeManager. These are the workers.

Configuring Hadoop

Hadoop’s Java configuration is driven by two types of important configuration files:

Read-only default configuration — core-default.xml, hdfs-default.xml, yarn-default.xml and mapred-default.xml.

Site-specific configuration — etc/hadoop/core-site.xml, etc/hadoop/hdfs-site.xml, etc/hadoop/yarn-site.xml and etc/hadoop/mapred-site.xml.

Additionally, you can control the Hadoop scripts found in the bin/ directory of the distribution by setting site-specific values via etc/hadoop/hadoop-env.sh and etc/hadoop/yarn-env.sh.

To configure the Hadoop cluster you will need to configure the environment in which the Hadoop daemons execute as well as the configuration parameters for the Hadoop daemons.

HDFS daemons are NameNode, SecondaryNameNode, and DataNode.

The Python and Hadoop integration prompts you for details about the NameNode, DataNode, and client; the script then writes the configuration files and starts the nodes automatically, so a complete cluster is formed. A rough sketch of the NameNode side is shown below.
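
The sketch below shows roughly how the NameNode might be configured and started, assuming Hadoop is already installed and its configuration directory is /etc/hadoop; the directory, port, and property values are illustrative, not the author's exact settings.

```python
# A rough sketch of configuring and starting a NameNode. Assumes Hadoop is
# installed with its config directory at /etc/hadoop (adjust to your install);
# directory and port values are illustrative.
import os

name_dir = input("NameNode directory (e.g. /nn): ").strip()
port = input("Port for Hadoop (e.g. 9001): ").strip()

hdfs_site = f"""<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>{name_dir}</value>
  </property>
</configuration>"""

core_site = f"""<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:{port}</value>
  </property>
</configuration>"""

os.makedirs(name_dir, exist_ok=True)
with open("/etc/hadoop/hdfs-site.xml", "w") as f:
    f.write(hdfs_site)
with open("/etc/hadoop/core-site.xml", "w") as f:
    f.write(core_site)

os.system("hadoop namenode -format")          # format the NameNode (confirms interactively)
os.system("hadoop-daemon.sh start namenode")  # start the NameNode daemon
```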

Configure Amazon Web Services

AWS (Amazon Web Services) is a comprehensive, evolving cloud computing platform provided by Amazon that includes a mixture of infrastructure as a service (IaaS), platform as a service (PaaS) and packaged software as a service (SaaS) offerings. AWS services can offer an organization tools such as compute power, database storage and content delivery services.

AWS launched in 2006 from the internal infrastructure that Amazon.com built to handle its online retail operations. AWS was one of the first companies to introduce a pay-as-you-go cloud computing model that scales to provide users with compute, storage or throughput as needed.

The AWS technology is implemented at server farms throughout the world and maintained by the Amazon subsidiary. Fees are based on a combination of usage (known as a “pay-as-you-go” model); the hardware, operating system, software, or networking features chosen by the subscriber; and the required availability, redundancy, security, and service options. Subscribers can pay for a single virtual AWS computer, a dedicated physical computer, or clusters of either.

Through Python and AWS integration, you can automate AWS tasks.

By entering the corresponding menu number you can do any of the following (a sketch of two of these operations follows the list):

  1. Create a key pair
  2. Create a security group
  3. Add ingress rules to an existing security group
  4. Launch an instance in the cloud
  5. Create an EBS volume
  6. Attach an EBS volume to an EC2 instance
  7. Configure a web server
  8. Create a static partition and mount the /var/www/html folder on the EBS volume
  9. Create an S3 bucket, upload an object into it, and make it publicly accessible
  10. Remove a specific object from an S3 bucket
  11. Delete a specific S3 bucket
  12. Create a CloudFront distribution with S3 as the origin
  13. Delete a key pair
  14. Stop EC2 instances
  15. Start EC2 instances
  16. Terminate EC2 instances
  17. Delete a security group
  18. Go back to the previous menu
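
The sketch below covers two of these options, creating a key pair and launching an instance, using boto3; the region, AMI ID, instance type, and names are placeholders, and AWS credentials are assumed to be configured already.

```python
# A minimal boto3 sketch for two of the menu options above. The region, AMI ID,
# instance type, and names are placeholders; credentials are assumed to be set
# up via the AWS CLI or environment variables.
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")

def create_key_pair(name):
    """Create a key pair and save the private key locally."""
    resp = ec2.create_key_pair(KeyName=name)
    with open(f"{name}.pem", "w") as f:
        f.write(resp["KeyMaterial"])
    print(f"Key pair {name} created")

def launch_instance(key_name, security_group_id):
    """Launch a single EC2 instance with the given key and security group."""
    resp = ec2.run_instances(
        ImageId="ami-0e306788ff2473ccb",   # placeholder AMI ID
        InstanceType="t2.micro",
        KeyName=key_name,
        SecurityGroupIds=[security_group_id],
        MinCount=1,
        MaxCount=1,
    )
    print("Launched instance:", resp["Instances"][0]["InstanceId"])

choice = input("1. Create key pair  2. Launch instance: ").strip()
if choice == "1":
    create_key_pair(input("Key name: ").strip())
elif choice == "2":
    launch_instance(input("Key name: ").strip(),
                    input("Security group id: ").strip())
```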
