Vishwanath Nayak

Basically, it's a strategy behind building infrastructure that scales:

1. Containerization

↳ Package applications with all dependencies

↳ Ensure consistent behavior across environments

↳ Docker simplifies build, ship, and run workflows

↳ Enables microservices architecture

↳ Perfect for DevOps practices

2. Container Orchestration

↳ Kubernetes leads the way here

↳ Automates container deployment & scaling

↳ Handles load balancing

↳ Manages container health

↳ Enables zero-downtime deployments

↳ Supports multi-cloud deployments
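To make the health-management and zero-downtime points concrete, here is a minimal sketch of a Kubernetes Deployment that rolls out one pod at a time and only routes traffic to pods that pass a readiness check. The name, image, port, and probe path are placeholders I made up for illustration.

```yaml
# Hypothetical Deployment: name, image, port, and probe path are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below the desired replica count
      maxSurge: 1         # add one new pod at a time
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0.0
          ports:
            - containerPort: 8080
          readinessProbe:   # traffic is sent only to pods that pass this
            httpGet:
              path: /healthz
              port: 8080
          livenessProbe:    # the container is restarted if this keeps failing
            httpGet:
              path: /healthz
              port: 8080
```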

3. Infrastructure as Code (IaC)

↳ Terraform for multi-cloud provisioning

↳ CloudFormation for AWS-specific workloads

↳ Ansible for configuration management

↳ Version control your infrastructure

↳ Test before deployment

↳ Implement infrastructure CI/CD

↳ Maintain consistent environments

4. GitOps Workflow (not in this sheet)

↳ Git as single source of truth

↳ Automated infrastructure updates

↳ Pull-based deployments

↳ Built-in audit trail

↳ Easy rollbacks when needed

↳ Improved security & compliance

Why does this matter?

→ Improved availability and scalability

→ Consistent deployments

→ Optimized costs

→ Minimal human error

The key thing is that you treat infrastructure like code, and I mean literally: you make it versionable, testable, and reusable!

1. You deployed an Ansible playbook, but it fails on some hosts while working on others. How would you debug and resolve this?

Steps to debug:

Use ansible-playbook -vvv playbook.yml to enable verbose mode for more detailed logs.
Check the error message and failed task details.
Run ansible all -m ping to verify host connectivity.
Use ansible all -m setup to check system facts and detect OS/config differences.
Verify inventory configuration and ensure host groups are correctly assigned.
Run affected tasks in isolation using --limit <hostname> and --step to execute interactively.

Resolution:

If the issue is environment-specific, modify the playbook to handle different OS versions or missing dependencies using when conditions.
Fix missing permissions by using become: yes where required.
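As a minimal sketch of that resolution, the play below branches on the OS family fact and escalates privileges on the package tasks. The group name webservers and the package chrony are placeholders chosen for illustration, not part of the scenario.

```yaml
# Hypothetical play: 'webservers' and 'chrony' are illustrative names.
- name: Install a package on mixed OS families
  hosts: webservers
  gather_facts: true   # facts are needed for the 'when' conditions below
  tasks:
    - name: Install on Debian/Ubuntu hosts
      ansible.builtin.apt:
        name: chrony
        state: present
        update_cache: true
      become: true
      when: ansible_facts['os_family'] == 'Debian'

    - name: Install on RHEL-family hosts
      ansible.builtin.dnf:
        name: chrony
        state: present
      become: true
      when: ansible_facts['os_family'] == 'RedHat'
```

To reproduce a failure on a single host, this can be combined with the flags above, for example ansible-playbook playbook.yml --limit <hostname> -vvv.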

2. Your Ansible playbook takes too long to execute. How would you optimize it for faster performance?

Optimization techniques:

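Raise forks in ansible.cfg (the default is 5) so more hosts run in parallel, and consider strategy: free so fast hosts do not wait for slow ones. Note that serial only limits how many hosts run per batch; it is meant for rolling updates, not for speed.
Enable fact caching in ansible.cfg (gathering = smart, fact_caching = jsonfile, plus a fact_caching_connection path) to avoid gathering facts repeatedly, or set gather_facts: false on plays that need no facts.
Use async & poll for long-running tasks so they run in the background instead of blocking the play.
Reduce SSH overhead by setting pipelining = True in ansible.cfg.
Replace repeated near-identical tasks with a single loop (loop is preferred over the older with_items), and pass package modules a list of names so everything installs in one transaction.
Skip work that is already done by using when conditions or the creates/removes arguments of command and shell. changed_when only changes how a result is reported; it does not stop the task from running.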
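Here is a rough sketch of how a few of these techniques might look in one play. The package names and the script path are placeholders, and strategy: free is my own choice for the example; whether it helps depends on how uneven your hosts are.

```yaml
# Hypothetical play: package names and the script path are placeholders.
- name: Faster run with free strategy, list-based install, and async
  hosts: all
  strategy: free        # each host proceeds without waiting for the others
  gather_facts: false   # skip the full fact-gathering step; no task here uses facts
  become: true
  tasks:
    - name: Install several packages in a single task
      ansible.builtin.package:
        name:
          - git
          - curl
          - unzip
        state: present

    - name: Start a long-running job without blocking
      ansible.builtin.command: /usr/local/bin/long_job.sh
      async: 1800   # allow up to 30 minutes
      poll: 0       # do not wait here; check on it later
      register: long_job

    - name: Wait for the job to finish
      ansible.builtin.async_status:
        jid: "{{ long_job.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 60   # 60 checks x 30 seconds matches the async limit
      delay: 30
```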

3. You need to ensure that Ansible applies configurations only if there’s a change. How would you implement this?

Use idempotent modules like copy, template, lineinfile, and file, which apply changes only when required.
Use changed_when in shell/command tasks to detect actual changes.
Use notify handlers to trigger actions only when a task reports a change.
Run the playbook with --check --diff to dry-run and see changes before applying them (check_mode: yes on a task forces that one task to always run in check mode).
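A minimal sketch of these ideas together, assuming an nginx host and a made-up migration script that prints "Applied" when it actually changes something. The template name, paths, and that output string are all assumptions for illustration.

```yaml
# Hypothetical play: template, paths, service, and script output are placeholders.
- name: Apply configuration only when something changes
  hosts: webservers
  become: true
  tasks:
    - name: Render nginx config (reports 'changed' only if content differs)
      ansible.builtin.template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
        mode: "0644"
      notify: Reload nginx

    - name: Run migrations and report a change only if something was applied
      ansible.builtin.command: /opt/myapp/bin/migrate
      register: migrate
      changed_when: "'Applied' in migrate.stdout"

  handlers:
    - name: Reload nginx   # runs once at the end, and only if notified
      ansible.builtin.service:
        name: nginx
        state: reloaded
```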

4. After an Ansible update, your playbooks start failing due to deprecated modules. How do you handle this situation?

Steps to resolve:

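Identify deprecated modules from the [DEPRECATION WARNING] messages Ansible prints during a run (ansible-playbook --check -vvv playbook.yml shows them without applying changes), or scan the playbooks with ansible-lint.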
Check the Ansible release notes for alternative modules.
Modify playbooks to use recommended replacements (e.g., replacing ec2 with amazon.aws.ec2_instance).
Test the updated playbook in a non-production environment before deploying.
Pin the working Ansible version using pip install ansible==<working-version> if immediate migration isn’t possible.
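As a sketch of what the ec2 replacement could look like, here is a task using amazon.aws.ec2_instance. Every value below (region, AMI, subnet, key name) is a placeholder, and parameter names differ from the old module in places, so check the module documentation for the collection version you have installed.

```yaml
# Hypothetical play: every value below is a placeholder.
- name: Launch an instance with the supported module
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Create instance (replaces the deprecated 'ec2' module)
      amazon.aws.ec2_instance:
        name: demo-web-01
        region: us-east-1
        instance_type: t3.micro
        image_id: ami-0123456789abcdef0
        key_name: demo-key
        vpc_subnet_id: subnet-0123456789abcdef0
        state: running
```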

5. Your Ansible playbook is executing but not making the expected changes to remote servers. How would you troubleshoot?

Run the playbook with --check --diff to preview changes before applying.
Check register variables to validate task outputs.
Ensure the changed_when condition is correctly set.
Verify correct become privileges are applied.
Check inventory and variables using ansible-inventory --graph.
Review logs using -vvv for deeper insights.
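One way to see what a task really did is to register its result and print it, then check the end state independently. A small sketch, assuming a made-up config file and setting:

```yaml
# Hypothetical play: the file path and setting are placeholders.
- name: Inspect what a task actually did
  hosts: webservers
  become: true
  tasks:
    - name: Ensure a setting is present
      ansible.builtin.lineinfile:
        path: /etc/myapp/app.conf
        regexp: '^max_connections='
        line: max_connections=200
      register: conf_result

    - name: Show the full task result (changed, diff, msg, ...)
      ansible.builtin.debug:
        var: conf_result

    - name: Fail loudly if the line is still missing
      ansible.builtin.command: grep -q '^max_connections=200$' /etc/myapp/app.conf
      changed_when: false   # a read-only check never counts as a change
```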

6. How would you design an Ansible architecture to scale across multiple regions with minimal overhead?

Key considerations:

Use Ansible Tower/AWX to centralize execution and manage multiple environments.
Implement a dynamic inventory for cloud-based scaling (AWS, Azure, GCP).
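Use delegate_to: localhost (local_action is the older shorthand) for API calls and other work that does not need to run on the target host, which avoids an SSH round trip.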
Leverage pull-based execution using ansible-pull for distributed control.
Use fact caching and log aggregation for efficient execution.
Optimize networking by using regional jump hosts or bastion servers.
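For the dynamic inventory point, a minimal sketch using the amazon.aws.aws_ec2 inventory plugin might look like this. The regions and the Environment tag are assumptions; adjust them to your own accounts and tagging scheme.

```yaml
# File name must end in aws_ec2.yml or aws_ec2.yaml, e.g. inventory.aws_ec2.yml
# Regions and the tag name are placeholders.
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
  - eu-west-1
filters:
  instance-state-name: running
keyed_groups:
  # creates groups such as region_us_east_1
  - key: placement.region
    prefix: region
  # creates groups from an 'Environment' tag, such as env_prod
  - key: tags.Environment
    prefix: env
```

You can preview the resulting groups with ansible-inventory -i inventory.aws_ec2.yml --graph.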

7. You need to ensure zero downtime while applying Ansible playbooks in production. What approach would you take?

Use rolling updates with serial: <N> to update a few hosts at a time.
Implement blue-green deployment by switching traffic between two environments.
Use canary deployment, applying changes to a subset before full rollout.
Ensure health checks before restarting critical services.
Use the reboot module with pre/post checks so a host rejoins the pool only after it is healthy again.
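A rolling update with a health gate can be sketched roughly like this. The group, service name, port, and /healthz path are placeholders, and max_fail_percentage: 0 is my addition so the rollout stops as soon as one host fails.

```yaml
# Hypothetical play: group, service name, port, and /healthz path are placeholders.
- name: Rolling restart with health checks
  hosts: webservers
  become: true
  serial: 2                # two hosts per batch
  max_fail_percentage: 0   # abort the rollout if any host in a batch fails
  tasks:
    - name: Deploy new application config
      ansible.builtin.template:
        src: app.conf.j2
        dest: /etc/myapp/app.conf
      notify: Restart myapp

    - name: Run pending handlers now, before the health check
      ansible.builtin.meta: flush_handlers

    - name: Wait until the service answers on its health endpoint
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}:8080/healthz"
        status_code: 200
      register: health
      until: health.status == 200
      retries: 10
      delay: 6
      delegate_to: localhost   # probe from the control node
      become: false

  handlers:
    - name: Restart myapp
      ansible.builtin.service:
        name: myapp
        state: restarted
```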

8. An Ansible playbook is stuck waiting for user input, causing automation failures. How would you fix it?

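Check what is prompting: drop -k/--ask-pass, -K/--ask-become-pass, and --ask-vault-pass from automated runs and use SSH keys, passwordless sudo (or a vaulted ansible_become_password), and --vault-password-file instead.
Supply secrets from Ansible Vault rather than prompting for them, and set no_log: true on tasks that handle them so they stay out of the logs.
Pass required values via extra_vars (-e "var_name=value"); a vars_prompt variable that is already defined this way is not prompted for.
Give vars_prompt entries default values, or replace them with regular variables, to avoid manual input.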
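A small sketch of a prompt that automation can bypass. The variable name app_version and its default are made up for the example.

```yaml
# Hypothetical play: 'app_version' and its default are placeholders.
- name: Prompt only when no value was supplied
  hosts: webservers
  vars_prompt:
    - name: app_version
      prompt: Which version should be deployed?
      default: "1.0.0"
      private: false   # echo the typed value; this is not a secret
  tasks:
    - name: Show the version in use
      ansible.builtin.debug:
        msg: "Deploying version {{ app_version }}"
```

Running it as ansible-playbook deploy.yml -e "app_version=1.2.3" skips the prompt, because the variable is already defined on the command line (deploy.yml is just an assumed file name).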

9. Your Ansible deployment in a hybrid cloud environment is failing due to network latency. What strategies can you use?

Use multiple inventory sources for better region-based execution.
Implement Ansible Tower/AWX to manage execution closer to target regions.
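Reduce connection overhead with pipelining and SSH connection reuse (ControlMaster/ControlPersist), and tune forks to what the control node and the network can handle.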
Use ansible-pull for edge deployments to reduce network overhead.
Optimize playbook logic to minimize repeated connections.
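For high-latency links, connection settings can be applied per group so only the remote region is affected. A sketch of a group_vars file, assuming a group called remote_region and a made-up bastion host:

```yaml
# group_vars/remote_region.yml (hypothetical group name and bastion host)
ansible_ssh_pipelining: true   # fewer SSH operations per task
ansible_ssh_common_args: >-
  -o ControlMaster=auto
  -o ControlPersist=120s
  -o ProxyJump=deploy@bastion.example.com
```

Pipelining with become requires that requiretty is not enabled in sudoers on the target hosts.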

10. You need to apply a security patch across 1,000+ servers using Ansible while ensuring rollback in case of failure. How would you do this?

Steps for safe deployment:

Take a backup before applying changes using the fetch or copy module.
Deploy in batches using serial: <N> to minimize impact.
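Verify patch installation with a check task (for example, a command that queries the package version or service status) before moving to the next batch.
Rollback strategy:
- Wrap the patch tasks in block/rescue so the rollback tasks run automatically if patching or verification fails.
- Store a snapshot of critical files (tar backup).
- Revert to previous packages if necessary (yum history undo <transaction-id> on RHEL-family hosts, or apt-get install <package>=<previous-version> on Debian-family hosts; dpkg --remove only uninstalls a package).
Monitor & alert: Handlers fire only when a task reports a change, so report failures from the rescue section or a callback plugin rather than relying on notify.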
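Putting the steps together, here is a rough sketch for RHEL-family hosts. It uses block/rescue for the rollback rather than when conditions, and the package, file paths, batch size, and verification command are all assumptions for illustration.

```yaml
# Hypothetical play: package, paths, batch size, and verify command are placeholders.
- name: Patch in batches with automatic rollback
  hosts: all
  become: true
  serial: 50                # 50 hosts per batch
  max_fail_percentage: 10   # stop the rollout if more than 10% of a batch fails
  tasks:
    - name: Patch, verify, and roll back on failure
      block:
        - name: Back up the current config on the host
          ansible.builtin.copy:
            src: /etc/ssh/sshd_config
            dest: /var/backups/sshd_config.pre_patch
            remote_src: true
            mode: "0600"

        - name: Apply the security update
          ansible.builtin.dnf:
            name: openssh-server
            state: latest
          register: patch

        - name: Verify the patched service still accepts its config
          ansible.builtin.command: sshd -t
          changed_when: false

      rescue:
        - name: Undo the package transaction (only if the update changed anything)
          ansible.builtin.command: dnf history undo last -y
          when: patch is defined and patch is changed

        - name: Restore the backed-up config
          ansible.builtin.copy:
            src: /var/backups/sshd_config.pre_patch
            dest: /etc/ssh/sshd_config
            remote_src: true
            mode: "0600"

        - name: Mark the host as failed so it counts toward max_fail_percentage
          ansible.builtin.fail:
            msg: "Patch rolled back on {{ inventory_hostname }}"
```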

Monday, March 10, 2025

Infrastructure as a code landscape

Friday, February 7, 2025

Ansible Scenario based interview question & Answers
