CIS 5470: Software Analysis

Fall 2025 • University of Pennsylvania


📋 Course Information

Instructor

Prof. Mayur Naik
📍 AGH 642
🕐 Office Hours: TBA
📧 mhnaik@seas.upenn.edu

Teaching Assistants

Mayank Keoliya
📧 mkeoliya@seas.upenn.edu

Zain Aamer
📧 zaamer@seas.upenn.edu

📍 TA Office: AGH 642
🕐 TA Hours: By Appointment

Lectures

📅 Monday & Wednesday
🕐 1:45pm - 3:15pm
📍 AGH 203


📅 Course Schedule


Week | Dates | Topic | Lab | Due
1 | Aug 27 | Introduction to Software Analysis | Lab 1: Introduction to Software Analysis | -
2 | Sep 3 | The LLVM Framework | Lab 2: The LLVM Framework | Lab 1
3 | Sep 8, 10 | Random Input Generation | Lab 3: Random Input Generation | Lab 2
4 | Sep 15, 17 | Automated Test Generation | Lab 4: Delta Debugging | Lab 3
5 | Sep 22, 24 | Delta Debugging | Lab 5: Statistical Debugging | Lab 4
6 | Sep 29, Oct 1 | Statistical Debugging | Lab 6: Dataflow Analysis | Lab 5
7 | Oct 6, 8 | Dataflow Analysis I | - | Lab 6
8 | Oct 13, 15 | Fall Break (Oct 9-12) / Dataflow Analysis II | Lab 7: Pointer Analysis | -
9 | Oct 20, 22 | Pointer Analysis | Lab 8: Constraint-Based Analysis | Lab 7
10 | Oct 27, 29 | Constraint-Based Analysis | Lab 9: Dynamic Symbolic Execution | Lab 8
11 | Nov 3, 5 | Type Inference | - | Lab 9
12 | Nov 10, 12 | Symbolic Execution | Group Project | -
13 | Nov 17, 19 | Advanced Topics | - | -
14 | Nov 24 | Thanksgiving Break (Nov 27-30) | - | -
15 | Dec 1, 3 | Course Review & Project Presentations | - | Group Project
16 | Dec 8 | Last Day of Classes | - | -
Finals | Dec 11-18 | Final Exam Period | - | -

📚 Course Description

Your 500-line vibe-coded class project works perfectly. Google’s 100+ million line codebase? That’s a different universe.

At scale, software is complex, buggy, and insecure. Enter software analysis: a suite of techniques to automatically analyze code, uncover bugs, and ensure reliability. And this has real-world impact: when Google deploys to billions of devices, a single divide-by-zero error can drain millions of batteries worldwide, or worse, crash a warship or a rocket’s propulsion system. Software analysis tools are already in production use: Meta’s Infer has prevented thousands of crashes, while Google’s Tricorder fixes 5000+ bugs daily, to name a few.

This course provides a rigorous and hands-on introduction to the field of software analysis: a body of powerful techniques and tools for analyzing modern software. You will learn to:

  • 🐛 Systematically uncover insidious bugs
  • 🔒 Prevent security vulnerabilities
  • ⚙️ Automate testing and debugging
  • ✅ Improve confidence in software behavior, even mathematically

⚠️ New: Starting this semester, we’ll also address the trillion-parameter elephant in the room: Large Language Models (LLMs). With LLMs writing more vibe-code than ever, it’s important to devise automatic ways of ensuring code doesn’t blow up in production. We’ll explore how LLMs can assist in software analysis tasks, and where they fall short. Our team is re-working the labs as we go, so bear with us!


Topics Covered

Dynamic Analysis

  • Random testing & fuzzing
  • Delta debugging
  • Statistical debugging
  • Runtime monitoring
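
To give a flavor of one of these topics, here is a minimal sketch of delta debugging: given an input that triggers a failure, it searches for a smaller input that still fails. It is written in Python for brevity (the labs themselves use C++ and LLVM), it is a simplified variant that only tries deleting chunks, and the names `ddmin` and `crashes` are illustrative assumptions rather than course code.

```python
def ddmin(failing_input, fails):
    """Shrink a failing list to a smaller one on which fails() is still True.

    Simplified delta debugging: repeatedly try deleting one chunk at a time,
    and switch to finer chunks whenever no deletion keeps the failure alive.
    """
    assert fails(failing_input), "the starting input must trigger the failure"
    current = list(failing_input)
    n = 2  # number of chunks the input is currently split into
    while len(current) >= 2:
        chunk = len(current) // n
        reduced = False
        for start in range(0, len(current), chunk):
            # Candidate = current input with one chunk removed.
            candidate = current[:start] + current[start + chunk:]
            if fails(candidate):
                current = candidate
                n = max(n - 1, 2)  # keep roughly the same chunk size
                reduced = True
                break
        if not reduced:
            if n == len(current):
                break  # chunks are single elements; nothing more to remove
            n = min(n * 2, len(current))  # try finer-grained chunks
    return current


def crashes(divisors):
    """Hypothetical program under test: fails on any zero divisor."""
    try:
        for d in divisors:
            100 // d
        return False
    except ZeroDivisionError:
        return True


print(ddmin([3, 7, 0, 5, 9, 2, 4, 8], crashes))  # prints [0]
```

The result is minimal in the sense that removing any single remaining element makes the failure disappear; it is not guaranteed to be the globally smallest failing input.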

Static Analysis

  • Dataflow analysis
  • Pointer analysis
  • Type systems
  • Constraint-based analysis
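
As an illustration of the static side, the sketch below runs a small forward dataflow analysis that flags divisions whose divisor might be zero, without executing the program. It is written in Python for brevity (the labs themselves use C++ and LLVM). The toy instruction format, the three-value abstract domain, and all function names are assumptions made for this example, not the course's lab framework.

```python
from collections import deque

ZERO, NONZERO, UNKNOWN = "zero", "nonzero", "unknown"


def join(a, b):
    """Merge two abstract values: agree, or fall back to UNKNOWN."""
    return a if a == b else UNKNOWN


def join_states(s1, s2):
    """Merge two variable maps; None means the block is not yet reached."""
    if s1 is None:
        return s2
    if s2 is None:
        return s1
    out = dict(s1)
    for var, val in s2.items():
        # A variable defined on only one path keeps that path's value.
        out[var] = join(out[var], val) if var in out else val
    return out


def transfer(state, instr):
    """Abstract effect of one instruction on the variable map."""
    state = dict(state)
    op, dst = instr[0], instr[1]
    if op == "const":
        state[dst] = ZERO if instr[2] == 0 else NONZERO
    elif op == "copy":
        state[dst] = state.get(instr[2], UNKNOWN)
    else:  # results of "input" and "div" are not tracked precisely
        state[dst] = UNKNOWN
    return state


def find_possible_div_by_zero(cfg, entry):
    """cfg maps block name -> (instructions, successor block names)."""
    in_state = {block: None for block in cfg}
    in_state[entry] = {}
    worklist = deque([entry])
    while worklist:  # iterate until no block's input state changes
        block = worklist.popleft()
        instrs, succs = cfg[block]
        state = in_state[block]
        for instr in instrs:
            state = transfer(state, instr)
        for succ in succs:
            merged = join_states(in_state[succ], state)
            if merged != in_state[succ]:
                in_state[succ] = merged
                worklist.append(succ)
    warnings = []
    for block, (instrs, _) in cfg.items():
        state = in_state[block]
        if state is None:
            continue  # unreachable block
        for instr in instrs:
            if instr[0] == "div" and state.get(instr[3], UNKNOWN) != NONZERO:
                warnings.append((block, instr))
            state = transfer(state, instr)
    return warnings


# Toy program: d is 4 on one branch and 0 on the other before the final division.
cfg = {
    "entry": ([("input", "n"), ("const", "d", 4), ("div", "r", "n", "d")], ["then", "else"]),
    "then": ([("const", "d", 0)], ["exit"]),
    "else": ([], ["exit"]),
    "exit": ([("div", "q", "n", "d")], []),
}
print(find_possible_div_by_zero(cfg, "entry"))  # prints [('exit', ('div', 'q', 'n', 'd'))]
```

The division in `entry` is proven safe because `d` is known to be nonzero there, while the one in `exit` is flagged because the two branches disagree about `d`. This shows the scalability vs. precision trade-off in miniature: the analysis is cheap and covers all paths, but it may also warn about divisions that can never fail at run time.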

All topics include hands-on implementation using the LLVM compiler infrastructure. LLVM, created by Chris Lattner during his UIUC PhD, powers modern compiler technology. His work led to Clang, caught Apple’s attention, and enabled Swift’s development. Today LLVM underlies Apple’s toolchain, Google’s optimizations, and Meta’s production tools—making it ideal for understanding real-world analysis.


🎯 Learning Objectives

Upon completion of this course, you will be able to:

  • Understand fundamental methods for analyzing, testing, and verifying software
  • Analyze trade-offs between different techniques (scalability vs. precision)
  • Implement analysis algorithms using LLVM
  • Apply appropriate techniques to real-world problems
  • Evaluate the effectiveness of different approaches


📋 Prerequisites

  • CIS 240/CIT 595: Systems Programming (C/C++ required)
  • CIS 120/CIT 594: Data Structures and Algorithms
  • CIS 160/CIT 592: Mathematical Foundations

⚠️ Note: Labs involve substantial C++ programming with LLVM


📊 Grading

Component | Weight
Labs (up to 9) | 54%
Quizzes | 3%
Group Project | 23%
Final Exam | 20%

Late Policy: 6 late days total


📖 Resources

Textbooks

  • No required textbook - All materials provided
  • Recommended: Static Program Analysis (free online)
  • Reference: Principles of Program Analysis (Nielson et al.)

⚖️ Academic Integrity

All submitted work must be your own. You may discuss concepts, but code must be written independently.

AI Policy: ChatGPT/Copilot may be used for understanding concepts only; direct code generation is not allowed. Any usage must be disclosed.

Violations → Failing grade + referral to OSC



© 2025 University of Pennsylvania. All rights reserved.