AI History Timeline and Key Milestones
Pre-AI Era: Mathematical and Theoretical Foundations (Before 1950)
1940s
- 1943: Warren McCulloch and Walter Pitts publish “A Logical Calculus of Ideas Immanent in Nervous Activity” - First mathematical model of artificial neurons
- 1945: Vannevar Bush proposes the Memex, an early concept for information retrieval systems
- 1948: Claude Shannon publishes “A Mathematical Theory of Communication” - Foundation of information theory
- 1949: Donald Hebb introduces Hebbian learning in “The Organization of Behavior”
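Hebb’s rule can be stated in one line of modern vector notation: a connection strengthens in proportion to the product of the pre- and post-synaptic activity it joins. A minimal sketch of that idea (the learning rate and toy activity values are illustrative, not from Hebb):

```python
import numpy as np

# Hebb's rule: "cells that fire together wire together."
# Each weight grows with the product of the activities it connects.
def hebbian_update(w, x, y, lr=0.1):
    return w + lr * np.outer(y, x)   # delta_w[j, i] = lr * y[j] * x[i]

w = np.zeros((2, 3))                 # 3 presynaptic -> 2 postsynaptic units
x = np.array([1.0, 0.0, 1.0])        # presynaptic activity
y = np.array([1.0, 0.5])             # postsynaptic activity
w = hebbian_update(w, x, y)          # co-active pairs strengthen
```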
Birth of AI (1950-1960)
1950
- Alan Turing publishes “Computing Machinery and Intelligence”, opening with the question “Can machines think?”
- Turing proposes the imitation game (Turing Test) as a measure of machine intelligence
1956
- Dartmouth Conference: John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon organize the Dartmouth Summer Research Project on Artificial Intelligence
- Term “Artificial Intelligence” coined by John McCarthy
- Birth of AI as an academic discipline
1955-1959
- Logic Theorist (1955-1956): Allen Newell, Herbert Simon, and Cliff Shaw create one of the first AI programs
- Perceptron (1957): Frank Rosenblatt develops the perceptron learning algorithm (see the sketch after this list)
- General Problem Solver (1957-1959): Newell and Simon’s attempt at a universal problem solver
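A minimal sketch of Rosenblatt’s perceptron rule: threshold a weighted sum to predict, then nudge the weights toward any misclassified example. The AND-gate data, learning rate, and epoch count below are illustrative choices, not Rosenblatt’s originals:

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            error = target - pred        # 0 when correct, otherwise +/-1
            w += lr * error * xi         # move the boundary toward xi
            b += lr * error
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])               # logical AND is linearly separable
w, b = train_perceptron(X, y)
print([int(xi @ w + b > 0) for xi in X])  # -> [0, 0, 0, 1]
```

For linearly separable data like this, the perceptron convergence theorem guarantees the loop finds a separating boundary; Minsky and Papert’s 1969 critique (below) turns on inputs like XOR, where no such boundary exists.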
Early Optimism and First Achievements (1960-1974)
1960s
- 1961: First industrial robot “Unimate” begins working at General Motors
- 1964-1966: Joseph Weizenbaum develops the ELIZA chatbot, an early demonstration of pattern-matched natural language conversation
- 1965: DENDRAL, the first expert system, begins development at Stanford to infer chemical structures from mass-spectrometry data
- 1966: ALPAC report leads to reduced funding for machine translation
- 1969: Perceptrons book by Minsky and Papert shows limitations of single-layer perceptrons
1970-1974
- 1970: Seppo Linnainmaa publishes reverse-mode automatic differentiation, the mathematical core of backpropagation
- 1972: MYCIN expert system for medical diagnosis
- 1972: Shakey, the first general-purpose mobile robot, completes development at Stanford Research Institute (project begun 1966)
First AI Winter (1974-1980)
Causes
- Computational limitations
- Lack of data
- Overpromising and underdelivering
- Limited funding from governments
Key Issues
- Combinatorial explosion problem
- Brittleness of expert systems
- Lack of common sense reasoning
Expert Systems Era (1980-1987)
1980s Boom
- 1980: First commercial expert system - XCON at Digital Equipment Corporation
- 1982: Japan announces Fifth Generation Computer Systems project
- 1983: Lisp machine makers Symbolics and LMI compete in a booming commercial AI hardware market
- 1986: Backpropagation algorithm popularized by Rumelhart, Hinton, and Williams
Key Developments
- Knowledge engineering becomes established field
- Rule-based systems proliferate
- AI market grows past $1 billion by the mid-1980s
Second AI Winter (1987-1993)
Causes
- Expert systems proved brittle and difficult to maintain
- Collapse of the specialized Lisp machine market
- Cheaper general-purpose workstations outperform dedicated AI hardware
Statistical Approaches and Machine Learning Renaissance (1990-2010)
1990s
- Early 1990s: Bayesian networks (formalized in Judea Pearl’s 1988 book) drive a shift toward probabilistic reasoning
- 1995: Tin Kam Ho introduces random decision forests
- 1995: Carnegie Mellon’s NavLab 5 completes the ~2,850-mile “No Hands Across America” drive, steering autonomously about 98% of the way
- 1997: Deep Blue defeats world chess champion Garry Kasparov
2000s
- Early 2000s: Support Vector Machines and kernel methods dominate applied machine learning
- 2006: Geoffrey Hinton and colleagues show deep belief networks can be trained one layer at a time, reviving interest in “deep learning”
- 2009: ImageNet dataset created by Fei-Fei Li
Deep Learning Revolution (2010-2016)
2010-2012
- 2010: GPU-accelerated implementations make training deep networks practical
- 2011: IBM Watson wins Jeopardy!
- 2012: AlexNet wins ImageNet competition - Deep learning breakthrough
2013-2016
- 2013: Word2Vec introduces word embeddings
- 2014: Generative Adversarial Networks (GANs) by Ian Goodfellow
- 2015: ResNet introduces residual connections, enabling networks hundreds of layers deep to train (see the sketch after this list)
- 2016: AlphaGo defeats world Go champion Lee Sedol
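A residual connection is simply an identity skip around a block: the block learns a correction F(x) on top of x, and gradients always have the addition as an unobstructed path backward. A minimal sketch, where the two-matrix transform stands in for any block:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, W1, W2):
    """y = x + F(x): gradients can always flow through the skip path."""
    return x + W2 @ relu(W1 @ x)

rng = np.random.default_rng(0)
d = 16
W1 = 0.1 * rng.normal(size=(d, d))   # illustrative block weights
W2 = 0.1 * rng.normal(size=(d, d))
x = rng.normal(size=d)
y = residual_block(x, W1, W2)        # same shape as x; identity path intact
```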
Transformer and LLM Era (2017-2023)
2017: Transformer Architecture
- “Attention Is All You Need” by Vaswani et al. introduces the Transformer architecture
- Self-attention replaces recurrence for sequence modeling (see the sketch after this list)
- Ends the dominance of RNNs/LSTMs in NLP
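The core operation is scaled dot-product self-attention: each token projects to query, key, and value vectors, scores itself against every other token, and takes a softmax-weighted mix of the values. A minimal single-head sketch; the dimensions and random weights are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project tokens
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # pairwise token affinities
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V                        # weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))       # toy token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)           # shape (4, 8)
```

Because every token attends to every other in one step, the computation parallelizes across the whole sequence, unlike an RNN’s token-by-token recurrence.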
2018-2019: BERT and GPT Emergence
- 2018: BERT (Bidirectional Encoder Representations from Transformers) by Google
- 2018: GPT-1 by OpenAI (117M parameters)
- 2019: GPT-2 by OpenAI (1.5B parameters) - Initially withheld due to safety concerns
2020-2021: Scaling Up
- 2020: GPT-3 by OpenAI (175B parameters) - Massive scale breakthrough
- 2020: T5 (Text-to-Text Transfer Transformer) by Google
- 2021: CLIP by OpenAI - Multimodal understanding
- 2021: Codex/GitHub Copilot - AI for programming
2022-2023: LLM Democratization
- 2022: ChatGPT launched (November 30) - Reaches 100M users in 2 months
- 2022: GPT-3.5 series released; GPT-4 in development
- 2023: GPT-4 released - Multimodal capabilities
- 2023: LLaMA by Meta - Openly released model weights seed the open-model ecosystem
- 2023: Claude by Anthropic - Constitutional AI
- 2023: Bard by Google - LaMDA-based conversational AI
Agentic AI Era (2023-Present)
2023: Agent Capabilities Emerge
- Tool use: LLMs learn to use external tools and APIs
- Function calling: Structured interaction with external systems (an illustrative example follows this list)
- Multi-step reasoning: Chain-of-thought and complex problem solving
- Code interpreter: Direct code execution capabilities
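Vendor APIs differ, but the function-calling loop has a common shape: the application advertises a tool as a JSON schema, the model emits a structured call instead of prose, and the application validates, executes, and feeds the result back. The schema layout and the `get_weather` tool below are hypothetical, shown only to illustrate that loop, not any specific vendor’s API:

```python
import json

# Hypothetical tool schema in the general style of LLM function calling.
tool_schema = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub standing in for a real API call

# The model emits a structured call; the application dispatches it.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
call = json.loads(model_output)
if call["name"] == tool_schema["name"]:
    result = get_weather(**call["arguments"])
    print(result)  # -> "Sunny in Paris"; fed back into the conversation
```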
2024-2025: Advanced Agent Systems
- Multi-agent frameworks: Multiple AI agents working together
- Autonomous planning: AI systems that can plan and execute complex tasks
- Mixture of Experts (MoE): Sparse models that route each token to a few specialized expert sub-networks (see the sketch after this list)
- Context engineering: Advanced prompt and context management
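In a sparse MoE layer, a small gating network scores the experts for each token and only the top-k experts actually run, which is why active parameters can sit far below total parameters (as in DeepSeek-V3’s 37B-of-671B split noted below). A toy top-2 routing sketch with random linear “experts”; all shapes and values are illustrative:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(x, experts, Wg, k=2):
    """Route a token to its top-k experts and mix their outputs."""
    logits = Wg @ x                        # one gating score per expert
    top = np.argsort(logits)[-k:]          # indices of the k best experts
    gates = softmax(logits[top])           # renormalize over chosen experts
    return sum(g * experts[i](x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
# Toy "experts": each is just a random linear map here.
Ws = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda x, W=W: W @ x for W in Ws]
Wg = rng.normal(size=(n_experts, d))       # gating network
y = moe_forward(rng.normal(size=d), experts, Wg, k=2)
```

Only 2 of the 4 experts execute per token here; production systems add load-balancing losses so routing does not collapse onto a few favorites.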
Key Paradigm Shifts
- Symbolic to Statistical (1990s): From rule-based to data-driven approaches
- Shallow to Deep (2010s): From hand-crafted features to deep learning
- Narrow to General (2020s): From task-specific to general-purpose models
- Passive to Agentic (2023+): From reactive to proactive AI systems
Major Technological Enablers
Hardware Evolution
- CPUs to GPUs: Parallel processing for neural networks
- TPUs: Google’s Tensor Processing Units for AI workloads
- Cloud computing: Scalable infrastructure for training large models
Data Revolution
- Internet growth: Massive text datasets available
- Digitization: More human knowledge in digital format
- Data preprocessing: Better techniques for cleaning and preparing data
Algorithmic Breakthroughs
- Backpropagation: Efficient training of neural networks
- Attention mechanisms: Improved sequence modeling
- Transfer learning: Pre-training and fine-tuning paradigm
- Scaling laws: Understanding how performance scales with model size
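The empirical finding is a power law: with data and compute kept generous, loss falls smoothly as a power of model size N, roughly L(N) = (N_c / N)^α. The constants below are approximately those Kaplan et al. (2020) report for their setup; treat the outputs as illustrative, not predictions for any real model:

```python
# Power-law scaling of loss with parameter count N: L(N) = (N_c / N) ** alpha.
def loss_from_params(n_params, n_c=8.8e13, alpha=0.076):
    return (n_c / n_params) ** alpha

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> loss ~ {loss_from_params(n):.2f}")
```

The small exponent is the whole story: each 10x in parameters buys a steady, predictable drop in loss, which is what justified the scale-ups from GPT-2 through GPT-4.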
Current State (2025)
Leading Models
- GPT-5: OpenAI’s latest flagship (architecture and parameter count undisclosed; widely rumored to be a multi-trillion-parameter MoE)
- Gemini 2.5 Pro: Google’s advanced multimodal model
- Claude 4: Anthropic’s constitutional AI approach
- Grok 4: xAI’s multi-agent architecture model
- DeepSeek-V3: Cost-effective MoE model (671B total, 37B active parameters)
Key Capabilities
- Multimodal understanding: Text, images, audio, video
- Long context: Million+ token context windows
- Tool integration: Seamless API and tool usage
- Code generation: Advanced programming assistance
- Reasoning: Complex multi-step problem solving
Future Directions
Technical Challenges
- AGI development: Path to Artificial General Intelligence
- Alignment: Ensuring AI systems remain beneficial
- Efficiency: Reducing computational requirements
- Interpretability: Understanding how AI systems make decisions
Societal Impact
- Automation: Impact on jobs and economy
- Education: Transformation of learning and teaching
- Creative industries: AI in art, writing, and media
- Scientific discovery: AI-accelerated research