In the growing field of explainable AI, tools like LIME and SHAP have made it possible to peek inside complex models and understand their reasoning. But just because a model can be explained doesn't mean every explanation is meaningful or trustworthy.
Evaluating the Quality of Explanations
Not all explanations are created equal. For an explanation to be useful in practice, it must do more than highlight inputs or display weights. It needs to behave reliably and reflect how the model actually works.
Two critical properties help assess that:
1. Consistency
A good explanation should behave consistently. That means:
- If you train the same model on different subsets of similar data, the explanations should remain relatively stable.
- Small changes to input data shouldn’t lead to dramatically different explanations.
Inconsistent explanations can confuse users, misrepresent what the model has learned, and signal overfitting or instability in the model itself.
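As a rough illustration of the second point, the sketch below explains the same applicant before and after adding a small amount of noise and compares the two attribution vectors. Everything here is a placeholder (synthetic data, illustrative feature names, a generic gradient-boosting model), not the credit models discussed later.

```python
# Minimal sketch of an input-perturbation stability check.
# Data, feature names, and model are synthetic placeholders.
import numpy as np
import shap
from scipy.stats import spearmanr
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
feature_names = ["credit_utilization", "income", "num_accounts", "age", "tenure"]
X = rng.normal(size=(2000, len(feature_names)))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=2000) > 0).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)  # attributions in the model's log-odds space

x = X[:1]                                           # one applicant
x_noisy = x + rng.normal(scale=0.01, size=x.shape)  # slightly perturbed copy

phi = explainer.shap_values(x)[0]
phi_noisy = explainer.shap_values(x_noisy)[0]

# A stable explanation should rank features in (nearly) the same order before
# and after the tiny perturbation; a low rank correlation is a warning sign.
rho, _ = spearmanr(phi, phi_noisy)
print(f"Rank correlation of attributions: {rho:.2f}")
```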
2. Faithfulness
Faithfulness asks a simple but powerful question: Do the features highlighted in the explanation actually influence the model’s prediction?
An explanation is not faithful if it attributes importance to features that, when changed or removed, don’t affect the outcome. This kind of misleading output can erode trust and create false narratives around how the model operates.
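One way to probe this is a deletion-style check: "remove" each feature (for example, by imputing a baseline value such as the training mean), measure how much the prediction moves, and compare that against the attribution magnitudes. The sketch below is a minimal version of that idea, again with synthetic data and a generic model rather than any real credit pipeline.

```python
# Minimal sketch of a deletion-style faithfulness check for one prediction.
# Data and model are synthetic placeholders.
import numpy as np
import shap
from scipy.stats import spearmanr
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 5))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)

x = X[:1].copy()
phi = explainer.shap_values(x)[0]   # attribution per feature
baseline = X.mean(axis=0)           # "remove" a feature by imputing its mean
p0 = model.predict_proba(x)[0, 1]

deltas = []
for j in range(X.shape[1]):
    x_del = x.copy()
    x_del[0, j] = baseline[j]
    deltas.append(abs(model.predict_proba(x_del)[0, 1] - p0))

# Faithful attributions: features with large |SHAP| should also produce large
# prediction changes when removed. Low agreement is a red flag.
rho, _ = spearmanr(np.abs(phi), deltas)
print(f"Attribution vs. prediction-change rank correlation: {rho:.2f}")
```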
Why These Metrics Matter
In sensitive applications like healthcare, lending, or security, misleading explanations are more than just technical flaws. They can have real-world consequences.
- Imagine a credit scoring model that cites a user’s browser history or favorite color as key decision drivers. Even if the model is technically accurate, such explanations would damage its credibility and raise ethical and legal concerns.
- In regulated industries, explanations that fail consistency or faithfulness checks can expose organizations to compliance risks and reputational damage.
Real-World Examples
Faithfulness Test: Credit Risk Model
A faithfulness test was applied to a credit risk model used to classify applicants as “high” or “low” risk. The SHAP explanation highlighted feature A (e.g., number of bank accounts) as highly important.
To test faithfulness, the feature was removed, and the model's prediction didn't change … at all!

What the graph shows:
- SHAP value for “Number of Bank Accounts” was +0.25 (suggesting a major contribution).
- But after removing it, the model’s risk prediction stayed the same, proving that this feature wasn’t actually influencing the output.
This revealed a serious problem: the explanations were unfaithful. SHAP was surfacing an irrelevant feature as important, likely because of correlation artifacts in the training data.
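A minimal sketch of that kind of removal test might look like the following; `model`, `feature_idx`, and `baseline_value` are hypothetical stand-ins for the actual credit risk model and its "Number of Bank Accounts" column.

```python
# Hypothetical single-feature removal test mirroring the check described above.
# `model` is any fitted classifier with predict_proba; `feature_idx` and
# `baseline_value` stand in for "Number of Bank Accounts" and a neutral value.
import numpy as np

def removal_test(model, x, feature_idx, baseline_value, tol=1e-3):
    """Return True if imputing the feature noticeably changes the prediction."""
    x = np.asarray(x, dtype=float).reshape(1, -1)
    p_before = model.predict_proba(x)[0, 1]

    x_removed = x.copy()
    x_removed[0, feature_idx] = baseline_value  # e.g., dataset mean or mode
    p_after = model.predict_proba(x_removed)[0, 1]

    print(f"risk before: {p_before:.3f}, after removal: {p_after:.3f}")
    return abs(p_after - p_before) > tol
```

In the scenario above, a SHAP value of +0.25 paired with `removal_test(...)` returning `False` is exactly the mismatch that flags the attribution as unfaithful.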
Consistency Test: Credit Risk Model
A credit scoring model was trained on two different but similar subsets of loan application data. Both versions produced the same prediction for a given applicant ("high risk") but gave very different explanations.

What the graph shows:
- In Training Set A, the top contributing feature was “Credit Utilization” (+0.3).
- In Training Set B, it was “Employment Type” (+0.28).
- The SHAP bar charts for the same applicant looked noticeably different, even though the final decision didn’t change.
This inconsistency raised questions about model stability: Can we trust that the model is learning the right patterns, or is it too sensitive to the training data?
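A minimal sketch of this kind of cross-subset check, again with synthetic data and placeholder feature names rather than the real loan data, could look like this:

```python
# Minimal sketch of a cross-subset consistency check: train the same model class
# on two halves of the data and compare each model's top feature for one applicant.
# Data and feature names are synthetic stand-ins for the loan application data.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(2)
feature_names = ["credit_utilization", "employment_type", "income", "num_accounts"]
X = rng.normal(size=(4000, len(feature_names)))
y = (0.8 * X[:, 0] + 0.7 * X[:, 1] > 0).astype(int)

applicant = X[:1]               # the same applicant is explained by both models
half = len(X) // 2
subsets = {"Training Set A": (X[:half], y[:half]),
           "Training Set B": (X[half:], y[half:])}

for name, (X_sub, y_sub) in subsets.items():
    model = GradientBoostingClassifier(random_state=0).fit(X_sub, y_sub)
    phi = shap.TreeExplainer(model).shap_values(applicant)[0]
    top = int(np.argmax(np.abs(phi)))
    print(f"{name}: prediction={model.predict(applicant)[0]}, "
          f"top feature={feature_names[top]} ({phi[top]:+.2f})")

# Agreement on the top features (not just on the final label) is what a
# consistency check is looking for.
```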
Final Thoughts
As AI systems continue to make critical decisions in our lives, explainability is not a luxury; it's a necessity. Tools like LIME and SHAP offer a valuable window into how models work, but that window needs to be clear and reliable.
Metrics like consistency and faithfulness help us evaluate the strength of those explanations. Without them, we risk mistaking noise for insights, or worse, making important decisions based on misleading information.
Accuracy might get a model deployed, but consistency and faithfulness should determine whether its explanations can be trusted. If you want to learn more about explainability in AI, check out this blog post, where I talk about how LIME and SHAP can help explain model outcomes.