Manish Shirke

Engineering Solutions. Delivering Results.

  • Home
  • Mission
  • Expertise
  • Accomplishments
  • Life Outside Work
  • Neural Networks Explained Through a Developer's Lens

    By Manish Shirke
    AI Systems Architect & DevOps Leader
    Exploring the future of intelligent automation

    Welcome. Today we will discuss one of the most talked-about ideas in modern technology—neural networks—and break it down in plain developer language. If you’ve ever looked at research papers or AI blogs and felt buried in equations, don’t worry. By the end of this article, you’ll see that neural networks are built on concepts you already know: functions, loops, objects, and data transformations.

    What Is a Neural Network?
    Imagine you’re writing a function in Python. It takes an input, does some calculations, and gives an output. That’s all a neural network is—just a giant function. But here’s the twist: instead of you writing all the if-else statements or formulas, the network’s parameters—its weights and biases—are tuned using data until the function produces the right outputs.

    Let’s compare:
    • Traditional software: Rules + input → output.
    • Neural networks: Data + output → rules.

    For example, if you wanted to write a spam filter the old way, you’d write a set of rules:
    • If the subject contains “free money,” mark as spam.
    • If the sender is not in contacts, increase spam score.
    • If there are ten exclamation marks, probably spam.
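    Those rules translate directly into code. A hypothetical sketch of the rule-based approach (the scoring scheme and threshold here are made up for illustration):

```python
def is_spam(subject, sender, contacts):
    # Hand-written heuristics: brittle, and spammers adapt faster than we do
    score = 0
    if "free money" in subject.lower():
        score += 2
    if sender not in contacts:
        score += 1
    if subject.count("!") >= 10:
        score += 2
    return score >= 2
```

    Every new spam trick means another hand-written rule, which is exactly the maintenance burden the learned approach avoids.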

    But the internet evolves faster than we can write rules. Neural networks solve this by automatically figuring out the “rules” from thousands of examples.
    It’s like hiring a junior developer who reads all the code and figures out patterns—except this developer works with numbers, not words.

    The Core Building Blocks
    So what are these “neurons” people talk about? Forget biology for a second. A neuron in this context is just a small formula:

    output = activation_function( weighted_sum_of_inputs )

    Let’s unpack that:
    • Inputs: These are the signals coming from the previous layer’s neurons or the initial data features.
    • Weights: These are parameters or coefficients that the network learns during training. They determine the strength or importance of each input.
    • Weighted sum of inputs: The core computation starts here. If a neuron receives n inputs, x1, x2, …, xn, each is multiplied by a corresponding weight, w1, w2, …, wn, and the products are summed.
    • Bias: Another learnable parameter, a constant offset added to the sum. It lets the neuron shift the activation function (making it easier or harder to activate) regardless of the input’s value.
    • Mathematically, this intermediate result (often called the net input, z) is:
    z = (w1·x1 + w2·x2 + … + wn·xn) + b
    • Activation function: The net input z is then passed through an activation function (σ), which introduces non-linearity into the network. Without it, the network could only learn linear relationships, no matter how many layers it had. Common examples include ReLU (Rectified Linear Unit), Sigmoid, and Tanh.
    output = σ(z)
    • Output: This is the final value produced by the neuron, which then serves as an input to the neurons in the next layer, or as the final prediction.

    In essence, the neuron first calculates a linear combination of its inputs (the weighted sum) and then applies a non-linear transformation (the activation function) to generate its final output.

    If you’re a developer, this looks a lot like:
    def neuron(inputs, weights, bias):
        return activation(dot(inputs, weights) + bias)

    That’s all! A neuron is just a little calculator that takes numbers in, multiplies, adds, and applies a function.
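    Filling in the helper functions makes the snippet runnable (sigmoid is an arbitrary choice of activation here; the input numbers are made up):

```python
import math

def dot(inputs, weights):
    # Element-wise multiply and sum: the weighted sum of inputs
    return sum(x * w for x, w in zip(inputs, weights))

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1 / (1 + math.exp(-z))

def neuron(inputs, weights, bias, activation=sigmoid):
    return activation(dot(inputs, weights) + bias)

out = neuron([1.0, 2.0, 3.0], [0.5, -0.25, 0.1], bias=0.1)
# out is a single number between 0 and 1
```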

    Now put a bunch of these neurons side by side—you get a layer.
    Stack several layers—you get a network.
    And when you stack many layers deep—that’s deep learning.

    Networks as Composable Functions
    Think of a neural network as nested function calls:
    def network(x):
        x = layer1(x)
        x = layer2(x)
        x = layer3(x)
        return x

    Each layer transforms the data a little bit. The first might extract rough patterns, like edges in an image. The middle layers refine them into shapes. The final layer says, “Aha, that’s a cat.”


    If you’ve ever piped Unix commands together—cat file | grep error | sort—that’s basically how layers in a neural network work. Each one does a tiny transformation, and the magic comes from composition.

    Training – The Learning Process
    This is where things get interesting. How does a network actually learn?
    1. Forward pass: Data flows in, output comes out.
    2. Loss function: Compare output to the correct answer. Calculate error.
    3. Backward pass (backpropagation): Work backwards to figure out which weights contributed to the error.
    4. Weight update: Adjust weights slightly using an optimizer such as stochastic gradient descent (SGD) or Adam.

    Now, what does that mean?
    SGD: Think of it like climbing down a mountain in the fog. You don’t see the whole slope, but you can feel the steepest direction right under your feet and take one small step at a time. That’s stochastic gradient descent — it updates weights step by step using small random batches of data.
    Adam: This is a smarter hiker with memory. Adam not only looks at the current slope but also remembers which directions worked well before, adjusting its step size adaptively. This makes training faster and more stable for many problems.
    Repeat this process over thousands—or millions—of examples. Eventually, the network gets really good at predicting outputs.
    For developers: imagine writing a program that auto-refactors itself every time a test fails. That’s exactly what a neural network does—except instead of code, it’s tweaking weights.
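    The four steps map one-to-one onto a PyTorch training loop. A minimal sketch, using a single linear layer and random stand-in data in place of a real dataset:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)                  # a one-layer stand-in "network"
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x = torch.randn(64, 10)                   # random batch (stand-in for real data)
y = torch.randn(64, 1)                    # random targets

losses = []
for step in range(100):
    pred = model(x)                       # 1. forward pass
    loss = loss_fn(pred, y)               # 2. loss: compare output to the answer
    optimizer.zero_grad()
    loss.backward()                       # 3. backward pass (backpropagation)
    optimizer.step()                      # 4. weight update
    losses.append(loss.item())
```

    Swapping `torch.optim.SGD` for `torch.optim.Adam` is a one-line change, which is part of why experimenting with optimizers is so cheap in practice.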

    A Developer-Friendly Analogy
    Think of weights as configuration values in a giant .env file.
    • At the start, they’re all random. Your app won’t run properly.
    • During training, you run tests, log failures, and tweak the values.
    • After enough iterations, the config is dialed in, and the app runs smoothly.
    Training is just automated config tuning at massive scale.

    Why Neural Networks Are Powerful
    Why bother with all this complexity? Because neural networks can represent nonlinear functions.
    A linear model can only draw straight lines. That’s fine if your data is simple. But real life is messy—faces, voices, languages, stock prices.
    By stacking nonlinear layers, neural networks can approximate insanely complex functions. That’s why they’re used in:
    • Image recognition
    • Speech to text
    • Self-driving cars
    • Large language models like ChatGPT

    Most operations boil down to matrix multiplications and nonlinear functions — with special structures like convolutions or attention mechanisms layered on top.
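    To make that concrete, here is a two-layer forward pass written as raw NumPy matrix multiplications plus a nonlinearity (the shapes and random values are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
W1, b1 = rng.standard_normal((10, 20)), np.zeros(20)   # layer 1 parameters
W2, b2 = rng.standard_normal((20, 1)), np.zeros(1)     # layer 2 parameters

def relu(z):
    return np.maximum(z, 0)

x = rng.standard_normal((4, 10))    # a batch of 4 inputs, 10 features each
h = relu(x @ W1 + b1)               # matrix multiply, then nonlinearity
out = h @ W2 + b2                   # another matrix multiply
```

    Strip away the framework conveniences and this is most of what a GPU spends its time doing during inference.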

    A Small Code Example
    Here’s a tiny network in PyTorch:
    import torch
    import torch.nn as nn

    class TinyNet(nn.Module):
        def __init__(self):
            super(TinyNet, self).__init__()
            self.fc1 = nn.Linear(10, 20)
            self.fc2 = nn.Linear(20, 1)

        def forward(self, x):
            x = torch.relu(self.fc1(x))
            x = torch.sigmoid(self.fc2(x))
            return x


    Explaining the code above in plain words:
    import torch.nn as nn gives access to PyTorch’s neural-network building blocks.
    The class TinyNet(nn.Module) inherits from nn.Module, which provides the machinery for tracking parameters, moving between devices, and switching between training and evaluation modes.
    The __init__(self) method is the constructor for the network class. It defines and initializes all the layers and their learnable parameters (weights and biases). It first calls the parent constructor, super(TinyNet, self).__init__(), which is mandatory for any PyTorch module: it initializes nn.Module so PyTorch can correctly track the network’s parameters and state.
    Then it defines the First Linear Layer 
    self.fc1 = nn.Linear(10, 20)
    It specifies that this layer expects 10 input features and will output 20 features. PyTorch automatically creates and initializes the weight matrix (stored as 20×10, i.e., out_features × in_features) and a bias vector of size 20; these are the parameters the network will learn.
    Then it defines the Second (Output) Linear Layer
    self.fc2 = nn.Linear(20, 1)
    It specifies that this layer takes the 20 features from the first layer’s output as input and produces 1 final output feature. This sets up the weight matrix (stored as 1×20) and a bias vector of size 1.
    In summary, the __init__ method establishes the complete architecture (10 → 20 → 1) and reserves the memory for all the parameters that will be adjusted during training.
    The forward(self, x) function is the most crucial part of a PyTorch module. It defines the computational path, the flow of data through the network: how an input x is transformed into an output prediction. It is called automatically when you pass data to an instance of the network, like model(data).

    The function takes an input tensor, x, and processes it sequentially through the defined layers and activation functions:
    First layer processing: x = torch.relu(self.fc1(x))
    The input x (10 features) is passed through the first fully connected (linear) layer, self.fc1, which computes the weighted sum plus bias (i.e., W1·x + b1). The output is a tensor with 20 features, which is immediately passed through the ReLU (Rectified Linear Unit) activation function, torch.relu(). This introduces the non-linearity that lets the network learn complex patterns.

    Second layer processing: x = torch.sigmoid(self.fc2(x))
    The 20-feature output of the previous step is passed through the second fully connected layer, self.fc2, which performs its own weighted sum and bias addition (i.e., W2·ReLU(…) + b2). The output is a single feature (1-dimensional).
    That single-feature result is passed through the Sigmoid activation function, torch.sigmoid(), which squashes the output into the range between 0 and 1.

    Output: return x
    The function returns the final, processed tensor, which is the network’s prediction: a single value between 0 and 1.

    That’s a complete network in under 15 lines. Notice how it looks like normal class-based OOP code. So, as a developer, don’t think of neural networks as alien objects. Think of them as libraries giving you a convenient abstraction over math.
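    A quick usage sketch (re-declaring TinyNet so the snippet runs on its own): instantiate the class and pass a batch of data through it.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super(TinyNet, self).__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.sigmoid(self.fc2(x))
        return x

model = TinyNet()
x = torch.randn(4, 10)   # a batch of 4 samples, 10 features each
y = model(x)             # calls forward() under the hood
```

    The result y has shape (4, 1), and every value lies strictly between 0 and 1 thanks to the sigmoid.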

    Common Misconceptions
    Before we wrap up, let’s clear the air:
    • Neural networks are not magic. They’re just math and data.
    • More layers ≠ better results. Bigger models are harder to train.
    • They don’t “understand” like humans do—they just learn statistical patterns.
    • Sometimes, simpler models like decision trees or even regex are faster and good enough.
    Knowing when not to use a neural network is just as important as knowing how they work.

    How Developers Should Think
    Here’s the mental model I recommend:
    • A neural network is a function that writes itself.
    • Training is like unit testing at scale—failures guide improvements.
    • Weights are just mutable state tuned over time.
    • Frameworks like TensorFlow, PyTorch, and Keras are your standard libraries—don’t reinvent the wheel.

    Closing Thoughts
    So that’s neural networks explained—not from the lens of math, but from the lens of a developer. They’re not mystical. They’re just functions and layers, trained on massive amounts of data.

    shirkem

    September 27, 2025
    Uncategorized
  • What Is Artificial Intelligence? A Simple Explanation for IT Professionals

    By Manish Shirke
    AI Systems Architect & DevOps Leader
    Exploring the future of intelligent automation

    Artificial Intelligence is everywhere—powering chatbots, detecting fraud, even writing code. But what exactly is AI? In this video, LearningFoundationGuru will give IT professionals a simple, no-nonsense explanation.

    By the end of this video, you’ll know what Artificial Intelligence really means, how it works at a high level, and why it matters for anyone working in IT today.

    Section 1: What AI Really Is
    Artificial Intelligence, or AI, is the field of computer science focused on building systems that can perform tasks normally requiring human intelligence—like understanding language, recognizing patterns, or making decisions.

    Under AI, we often hear about Machine Learning and Deep Learning.

    Machine Learning means teaching systems to learn from data instead of hard-coded rules.

    Deep Learning is a branch of Machine Learning that uses layered neural networks, inspired by the human brain, to handle very complex tasks like image recognition or speech understanding.

    Think of it like this: an old spam filter used fixed rules. A modern spam filter learns from thousands of emails, adapting over time. That’s the difference AI makes.

    Section 2: How AI Works

    At its core, AI follows a simple pipeline.

    First: Data.  – AI systems consume massive amounts of structured and unstructured data.

    Second: Training. –  Algorithms analyze the data to learn patterns.

    Third: Inference. –  Once trained, the system applies what it learned to new inputs and makes predictions.

    Imagine a new IT trainee. At first, they shadow you, studying logs and watching how you solve problems. 

    Over time, they practice under supervision. Eventually, they can handle tickets on their own. 

    That’s how AI works—with data playing the role of the training experience.
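    The Data → Training → Inference pipeline above can be sketched in a few lines of Python. A toy learned threshold stands in for a real ML algorithm here, and the numbers are entirely hypothetical:

```python
# 1. Data: (exclamation_count, label) pairs, where 1 = spam, 0 = ham
data = [(0, 0), (1, 0), (8, 1), (9, 1)]

# 2. Training: "learn" a threshold from the data instead of hard-coding one
ham_max = max(count for count, label in data if label == 0)
spam_min = min(count for count, label in data if label == 1)
threshold = (ham_max + spam_min) / 2   # midpoint between the two classes

# 3. Inference: apply the learned threshold to unseen inputs
def predict(exclamation_count):
    return 1 if exclamation_count > threshold else 0
```

    Real systems replace the midpoint rule with statistical learning over millions of examples, but the three-stage shape of the pipeline is the same.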

    Section 3: Why AI Matters for IT Pros

    For IT professionals, AI is not just a buzzword—it’s a tool.

    In automation, AI can analyze system logs, predict outages, and route tickets.

    In cloud services, platforms like Azure Cognitive Services, AWS SageMaker, and Google Vertex AI make AI accessible without deep research expertise.

    In productivity, AI copilots now write scripts, generate configuration files, and even create documentation.

    The future is clear: AI will not replace IT professionals. But IT professionals who learn how to use AI will replace those who don’t.

    Section 4: Common Misconceptions
    Let’s clear up a few myths.
    AI is not magic—it’s statistics, computing power, and data.
    AI is not human-level intelligence—it’s narrow, specialized, and task-specific.
    And AI will not take all IT jobs. Instead, it will shift them—making skills like data literacy, automation design, and prompt engineering more valuable.

    Conclusion
    To sum up: Artificial Intelligence is simply computers learning from data to perform tasks more intelligently. For IT pros, it’s a game-changing tool that enhances automation, productivity, and innovation.

    If you found this video helpful, don’t forget to like, subscribe, and share.
    And stay tuned—LearningFoundationGuru will soon cover practical AI tools you can start using in your IT workflows today. 🚀 Subscribe to LearningFoundationGuru on YouTube for more IT insights and simple AI explainers:

     👉 https://youtube.com/@LearningFoundationGuru

    shirkem

    September 17, 2025
    Uncategorized
  • AI-Powered Design: From Concept to Code

    How design, AI, and modern tooling are collapsing the gap between ideas and production-ready software.
    By Manish Shirke
    AI Systems Architect & DevOps Leader
    Exploring the future of intelligent automation

    Introduction

    Recently, I had the privilege of attending a fascinating podcast session on “AI-Powered Design: From Concept to Code” hosted on Udemy. The conversation showcased how artificial intelligence is reshaping the traditional designer–developer workflow, empowering designers to move from early sketches to production-ready applications faster than ever before.

    The core message: AI isn’t just a helper anymore—it’s becoming a creative collaborator.


    The Modern Tool Stack

    The session highlighted a stack of tools that work together almost like a relay race, handing off value at every stage of the process:

    • Notion → Capturing requirements, brainstorming features, and aligning with stakeholders in real-time.
    • Figma → Translating requirements into pixel-perfect, static UI/UX mockups.
    • Vercel V0 → Generating production-grade React/Next.js code directly from Figma designs.
    • Claude (Anthropic) → Handling backend logic, writing APIs, and even generating business logic in natural language.
    • Cursor IDE → An AI-native development environment where Claude lives inside the editor, providing execution, debugging, and iteration loops.

    This stack demonstrates how design intent flows seamlessly into deployable code.

    Flow Diagram

    Concepts evolve into mockups, code is auto-generated, AI refines backend logic, and the final product is deployed—all in one continuous loop.

    Key Insight

    We are witnessing a paradigm shift:
    Designers are no longer limited to creating static artifacts. With AI as a partner, they are emerging as builders — able to bring their concepts to life, test them in real environments, and ship working products without waiting for hand-offs.


    Why It Matters

    • Speed → Compresses the time from idea to MVP (Minimum Viable Product), letting teams validate concepts with real users faster.
    • Collaboration → AI tools integrate smoothly with human creativity, reducing friction between teams.
    • Accessibility → Lowers the barrier for non-developers to actively shape software.

    This convergence is more than a productivity boost—it signals a new era where design and code are no longer separate disciplines but two ends of the same creative process.


    Closing Thoughts

    The synergy of tools like Notion, Figma, Vercel V0, Claude, and Cursor IDE isn’t just a technical convenience; it’s a cultural change. AI is becoming a bridge that allows designers, product thinkers, and developers to collaborate in real-time, moving ideas to production with unprecedented speed.

    We’re entering a future where every designer can become a creator—and every creator can leverage AI as a trusted partner.

    shirkem

    September 9, 2025
    Uncategorized
  • Architecting Agentic AI: Horizontal vs. Vertical Agents Explained


    By Manish Shirke
    AI Systems Architect & DevOps Leader
    Exploring the future of intelligent automation




    In the evolving landscape of Agentic AI, understanding the distinction between horizontal and vertical agents is essential for designing effective systems.

    Horizontal Agents
    A horizontal agent is general-purpose and designed to perform a wide range of tasks across multiple domains. It functions like a universal assistant, capable of managing emails, scheduling, ordering food, controlling IoT devices, and more. Horizontal agents require strong memory, contextual awareness, and modular architecture to manage diverse functionalities.

    Vertical Agents
    Vertical agents, in contrast, are domain-specific. They are deeply integrated into a single application or workflow. These agents are tailored for tasks like Kubernetes troubleshooting, legal document analysis, or customer service in finance. Their narrow focus enables higher reliability and domain expertise.

    Comparing Horizontal and Vertical Agentic AI
    In practice, powerful Agentic AI systems often combine both types. A horizontal orchestrator delegates tasks to vertical specialists, akin to how a manager coordinates with domain experts. Understanding and leveraging this hierarchy is key to building scalable, efficient, and intelligent agentic systems.
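    The orchestrator-plus-specialists hierarchy can be sketched in a few lines of Python (all class and method names here are illustrative, not from any real agent framework):

```python
class VerticalAgent:
    """A domain-specific agent with deep expertise in one area."""

    def __init__(self, domain):
        self.domain = domain

    def handle(self, task):
        return f"[{self.domain}] handled: {task}"


class HorizontalOrchestrator:
    """A general-purpose agent that routes tasks to registered specialists."""

    def __init__(self):
        self.specialists = {}

    def register(self, domain, agent):
        self.specialists[domain] = agent

    def delegate(self, domain, task):
        # Fall back to a generic response when no specialist exists
        agent = self.specialists.get(domain)
        if agent is None:
            return f"[general] handled: {task}"
        return agent.handle(task)


orch = HorizontalOrchestrator()
orch.register("kubernetes", VerticalAgent("kubernetes"))
result = orch.delegate("kubernetes", "pod CrashLoopBackOff")
```

    The routing table mirrors the manager-and-experts analogy: the orchestrator knows who to ask, while each vertical agent knows how to answer within its domain.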

    The following table compares horizontal and vertical agents:

    Let us see how horizontal and vertical agents are used in the real world. The table below shows some real-world examples of horizontal agents.

    The table below shows some real-world examples of vertical agents.

    Conclusion
    As AI continues to evolve from passive tools into autonomous agents, understanding the difference between horizontal and vertical agent architectures becomes essential for building effective, scalable systems. Horizontal agents excel in flexibility and orchestration, acting as generalists capable of navigating diverse tasks across domains. In contrast, vertical agents bring depth and specialization, offering domain-specific intelligence embedded within tightly integrated workflows.

    Real-world platforms like ChatGPT, AutoGPT, and Devin showcase the power of horizontal orchestration, while specialized systems such as Harvey (legal), Med‑PaLM (medical), PathAI (pathology), and Erica (banking) exemplify how vertical agents are reshaping industry-specific applications.

    Looking ahead, the most impactful AI ecosystems will likely combine the strengths of both models: horizontal agents orchestrating and delegating tasks to a federation of vertical experts. This hybrid approach will enable organizations to harness the power of general reasoning alongside domain mastery — driving productivity, personalization, and innovation at scale.

    shirkem

    August 2, 2025
    Uncategorized

Blog at WordPress.com.
