CIS 5470: Software Analysis

Fall 2025 • University of Pennsylvania


📋 Course Information

Instructor

Prof. Mayur Naik
📧 mhnaik@seas.upenn.edu
🕐 Office Hours: TBA
📍 Location: AGH 642

Teaching Assistants

Zain Aamer
📧 zaamer@seas.upenn.edu
🕐 Office Hours: Tues 3-4pm
📍 Location: Levine 501 bump space

Mayank Keoliya
📧 mkeoliya@seas.upenn.edu
🕐 Office Hours: By Appointment
📍 Location: AGH 642

Lectures

📅 Monday & Wednesday
🕐 1:45pm - 3:15pm
📍 AGH 203


📅 Course Schedule


Week | Dates | Topic | Lab | Due
1 | Aug 27 | Introduction to Software Analysis | Lab 1: Introduction to Software Analysis | -
2 | Sep 3 | The LLVM Framework | Lab 2: The LLVM Framework | Lab 1
3 | Sep 8, 10 | Software Specifications | Lab 3: Random Testing | Lab 2
4 | Sep 15, 17 | Random Testing | - | -
5 | Sep 22, 24 | Delta Debugging | Lab 4: Delta Debugging | Lab 3
6 | Sep 29, Oct 1 | Statistical Debugging | Lab 5: Statistical Debugging | Lab 4
7 | Oct 6, 8 | Dataflow Analysis I | Lab 6: Dataflow Analysis | Lab 5
8 | Oct 13, 15 | Fall Break (Oct 9-12) / Dataflow Analysis II | - | -
9 | Oct 20, 22 | Pointer Analysis | Lab 7: Pointer Analysis | Lab 6
10 | Oct 27, 29 | Constraint-Based Analysis | Lab 8: Constraint-Based Analysis | Lab 7
11 | Nov 3, 5 | Dynamic Symbolic Execution | - | Lab 8
12 | Nov 10, 12 | Automated Test Generation | - | -
13 | Nov 17, 19 | Type Systems | - | -
14 | Nov 24 | Thanksgiving Break (Nov 27-30) | - | -
15 | Dec 1, 3 | Course Review and Wrap-Up / Final Exam | - | -
16 | Dec 8 | Project Presentations | - | Group Project

📚 Course Description

Your 500-line vibe-coded class project works perfectly. Google’s 100+ million line codebase? That’s a different universe.

At scale, software is complex, buggy, and insecure. Enter software analysis: a suite of techniques to automatically analyze code, uncover bugs, and ensure reliability. This has real-world impact: when Google deploys to billions of devices, a single divide-by-zero error can drain millions of batteries worldwide, or worse, crash a warship or a rocket’s propulsion system. Software analysis tools are already deployed in production: Meta’s Infer has prevented thousands of crashes, while Google’s Tricorder fixes 5000+ bugs daily, to name a few.

This course provides a rigorous, hands-on introduction to the field of software analysis, a body of powerful techniques and tools for analyzing modern software. These techniques can be used to:

  • 🐛 Systematically uncover insidious bugs
  • 🔒 Prevent security vulnerabilities
  • ⚙️ Automate testing and debugging
  • ✅ Improve confidence in software behavior, even mathematically

⚠️ New: Starting this semester, we’ll also address the trillion-parameter elephant in the room: Large Language Models (LLMs). With LLMs writing more vibe-code than ever, it’s important to devise automatic ways of ensuring code doesn’t blow up in production. We’ll explore how LLMs can assist in software analysis tasks and where they fall short. Our team is re-working the labs as we go, so bear with us!


Topics Covered

Dynamic Analysis

  • Random testing & fuzzing
  • Delta debugging
  • Statistical debugging
  • Runtime monitoring

Static Analysis

  • Dataflow analysis
  • Pointer analysis
  • Type systems
  • Constraint-based analysis

All topics include hands-on implementation using the LLVM compiler infrastructure. LLVM, created by Chris Lattner during his UIUC PhD, powers modern compiler technology. His work led to Clang, caught Apple’s attention, and enabled Swift’s development. Today LLVM underlies Apple’s toolchain, Google’s optimizations, and Meta’s production tools, which makes it ideal for understanding real-world analysis.


🎯 Learning Objectives

Upon completion of this course, you will be able to:

  • Understand fundamental methods for analyzing, testing, and verifying software
  • Analyze trade-offs between different techniques (scalability vs. precision)
  • Implement analysis algorithms using LLVM
  • Apply appropriate techniques to real-world problems
  • Evaluate the effectiveness of different approaches


📋 Prerequisites

  • CIS 240/CIT 595: Systems Programming (C/C++ required)
  • CIS 120/CIT 594: Data Structures and Algorithms
  • CIS 160/CIT 592: Mathematical Foundations

⚠️ Note: Labs involve substantial C++ programming with LLVM


📊 Grading

Component | Weight
Labs (up to 9) | 54%
Quizzes | 3%
Group Project | 23%
Final Exam | 20%

Late Policy: 6 late days total


📖 Resources

Textbooks

  • No required textbook - All materials provided
  • Recommended: Static Program Analysis (free online)
  • Reference: Principles of Program Analysis (Nielson et al.)

⚖️ Academic Integrity

All submitted work must be your own. You may discuss concepts, but code must be written independently.

AI Policy: ChatGPT/Copilot are allowed for understanding concepts only, not for direct code generation. You must disclose any usage.

Violations → Failing grade + referral to OSC



© 2025 University of Pennsylvania. All rights reserved.