A self-contained learning resource on Java threading fundamentals — concept notes, runnable code samples for every referenced file, and a hands-on lab to practice.
Fast-recall, hand-copyable cheat sheet. Diagrams are sketched as doodles.
doodle: book → [chef in box] → 2 little helpers inside same box
| | Concurrency | Parallelism |
|---|---|---|
| CPUs | 1 enough | needs many |
| Timing | interleaved | simultaneous |
| Idea | dealing with many | doing many |
doodle: 1 lane juggling = concurrency; 2 parallel lanes = parallelism
doodle: one box (process) with 3 arrows running in parallel inside it
doodle: two rows of beads (T1, T2) zig-zag interleaving along a time arrow →
| Method | Acts on |
|---|---|
| start() | new thread → run() |
| sleep() | current thread (pause) |
| join() | current waits for other |
doodle: NEW box → RUNNABLE circle with 3 side-pockets (blocked/waiting/timed) → END box
doodle: speed-vs-threads curve: rises, peaks at #cores, then flat/dips
These three words describe code at three different stages of life. A program is a static set of instructions stored on disk — it is not doing anything yet, just sitting there like a recipe book on a shelf. A process is an instance of that program actually running, and crucially it has its own memory space; think of a chef who has picked one recipe and is cooking it in the kitchen. A thread is the smallest unit of execution inside a process, and it shares the process's resources — like a helper working on a task from the same recipe, sharing the same kitchen and ingredients as the chef.
The intuition that ties them together is ownership of memory. Because a process has its own isolated memory, two processes cannot easily step on each other's data. Because threads live inside one process and share its memory, they are cheap to create and can communicate quickly — but that shared memory is also exactly what makes multithreading tricky, since two threads touching the same data can corrupt it.
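To make the shared-memory point concrete, here is a minimal sketch in which two threads write into one array that lives on the process's heap, and the main thread reads both values afterwards. The class and method names (`SharedMemoryDemo`, `fillFromTwoThreads`) are hypothetical and not part of the original notes.

```java
public class SharedMemoryDemo {
    // Two threads write into the SAME array object: threads share their process's heap.
    static int[] fillFromTwoThreads() throws InterruptedException {
        int[] shared = new int[2];                 // one object, visible to every thread
        Thread t1 = new Thread(() -> shared[0] = 10);
        Thread t2 = new Thread(() -> shared[1] = 20);
        t1.start();
        t2.start();
        t1.join();                                 // wait, and make the writes visible to the caller
        t2.join();
        return shared;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = fillFromTwoThreads();
        System.out.println(result[0] + " " + result[1]); // prints "10 20"
    }
}
```

Each thread writes to its own slot here, so nothing can be corrupted. The trouble described above starts when two threads update the same slot without coordination.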
Multitasking is the ability to make progress on multiple tasks concurrently using the CPU. In the form the slides describe, it involves running multiple independent processes — this is called multiprocessing — or multiple tasks concurrently. The operating system allocates CPU time to each task and, because a single CPU core can truly only execute one instruction stream at a time, it must switch between the different tasks rapidly. Everyday examples: running several applications at once on your computer, or several servers on a network.
The reasoning is that switching happens so fast (thousands of times a second) that to a human it feels simultaneous, even on a single core. [+] Each switch is called a "context switch": the CPU saves the state of the current task and loads the state of the next one. This switching is the conceptual seed of context-switching overhead that comes up again in multithreading.
These are the two ideas students most often confuse, so pin down the distinction precisely. Concurrency means an application is making progress on more than one task at the same time, or at least seemingly at the same time. A single CPU can be concurrent by rapidly interleaving tasks. Parallel execution is when a computer has more than one CPU or CPU core and genuinely makes progress on more than one task simultaneously. Parallelism (the broader term) means an application splits its work into smaller subtasks that can be processed in parallel — for instance on multiple CPUs at the exact same time.
The slides also name a middle case, parallel concurrent execution: threads are distributed across multiple CPUs, so threads on the same CPU run concurrently while threads on different CPUs run in parallel. The key requirement stated explicitly: to achieve true parallelism your application must have more than one thread running, and each thread must run on a separate CPU / core / GPU core.
| | Concurrency | Parallelism |
|---|---|---|
| Hardware | One core is enough | Needs multiple cores |
| Execution | Interleaved (time-sliced) | Simultaneous |
| One-liner [+] | Dealing with many things | Doing many things at once |
A single-threaded process executes one task at a time, sequentially, one after the
other. Everything runs on one thread, so all resources are managed there. The major
downside the slides highlight: if a task takes a long time, the process may become unresponsive
— there is no second thread free to respond to anything else meanwhile. The reference example is
SingleThreadedExample.java (see Code Samples → Single-Threaded).
The intuition is a single cashier serving a queue: simple and predictable, but one slow customer holds up everyone behind them. [+] This is exactly why a UI freezes ("Not Responding") when the main thread does heavy work — there is no other thread to keep the interface alive.
Multithreading means creating multiple threads within a single process so they can execute concurrently. It divides one process into multiple threads, gives each CPU time, and — like multitasking — relies on the CPU context-switching between threads. The slides note it can enhance computational power and is used, for example, to split a video-encoding task into multiple threads to achieve parallelism.
A multi-threaded process therefore executes multiple threads concurrently (and in parallel when cores
allow). Its defining properties from the deck: it can stay responsive by doing work in separate
threads; threads share resources like memory and file handles within the same process; it
requires context switching between threads, which adds some overhead; and it is suitable for
tasks that can be divided into parallel subtasks or that require concurrent execution. Reference
example: MultiThreadedExample.java (see Code Samples →
Multi-Threaded).
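The "stay responsive" property can be sketched as follows: the heavy loop runs on a worker thread while the calling thread keeps printing a heartbeat. This is an illustrative example only; `ResponsiveDemo` and `computeInBackground` are hypothetical names, and the loop is the same kind of busy work used in the single-threaded sample.

```java
public class ResponsiveDemo {
    // Sums 0..n-1 on a worker thread while the caller stays free to do other things.
    static long computeInBackground(long n) throws InterruptedException {
        long[] result = new long[1];               // slot the worker writes its answer into
        Thread worker = new Thread(() -> {
            long sum = 0;
            for (long i = 0; i < n; i++) sum += i;
            result[0] = sum;
        });
        worker.start();
        while (worker.isAlive()) {
            System.out.println("caller: still responsive...");
            worker.join(100);                      // wait at most 100 ms, then loop again
        }
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sum = " + computeInBackground(2_000_000_000L));
    }
}
```

In a UI application the heartbeat loop would be the event loop that keeps repainting and handling clicks.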
The demo runs two threads that each print numbers, with a sleep() between prints,
and asks: why doesn't one thread finish all of its output before the other starts? The slide's
timeline shows Thread 1 printing N1, sleeping, printing N2, sleeping, printing N3, while Thread 2 does
the same, and the two sequences interleave along the time axis rather than finishing in strict order.
The runnable version is in Code Samples → Multi-Threaded.
The reason: when a thread calls sleep() it gives up the CPU, so the scheduler is free to run
the other thread during that pause. Execution order across threads is therefore not guaranteed.
[+] Even without sleep(), the OS thread scheduler can
preempt and switch threads at almost any point, so you should never assume a particular interleaving. If
you need a guaranteed order, you must coordinate it explicitly (for example with join() or
synchronization).
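One way to coordinate order explicitly, as suggested above, is to join() the first thread before starting the second. The sketch below assumes a hypothetical class `OrderedRun`; it records which task ran first in a synchronized list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class OrderedRun {
    // Forces "first" to complete before "second" even starts.
    static List<String> runInOrder() throws InterruptedException {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Thread first = new Thread(() -> log.add("first"));
        Thread second = new Thread(() -> log.add("second"));
        first.start();
        first.join();      // caller blocks until `first` has finished...
        second.start();    // ...so `second` cannot possibly run before it
        second.join();
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runInOrder()); // prints "[first, second]"
    }
}
```

The price of the guarantee is that the two tasks no longer overlap, so this pattern fits only where the order genuinely matters.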
Runnable is an interface used to define a task that can be executed by a
thread. It has a single method, run(), which contains the code to be executed.
You put your work inside run(), then hand the task to a thread to carry out.
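As a small illustration of handing a Runnable to a thread, the sketch below runs the same kind of task twice: once by calling run() on the Thread object directly, and once through start(). Each task records the name of the thread that actually executed it. `RunVsStart`, `whoRanIt`, and the thread names are hypothetical.

```java
public class RunVsStart {
    // Returns { thread name seen by a direct run() call, thread name seen after start() }.
    static String[] whoRanIt() throws InterruptedException {
        String[] names = new String[2];

        Thread direct = new Thread(
                () -> names[0] = Thread.currentThread().getName(), "worker-direct");
        direct.run();                 // plain method call: executes on the CALLER's thread

        Thread started = new Thread(
                () -> names[1] = Thread.currentThread().getName(), "worker-started");
        started.start();              // creates a real thread that then invokes run()
        started.join();
        return names;
    }

    public static void main(String[] args) throws InterruptedException {
        String[] names = whoRanIt();
        System.out.println("run()   executed on: " + names[0]);
        System.out.println("start() executed on: " + names[1]);
    }
}
```

When launched from main, the first line reports the main thread and the second reports worker-started.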
The lifecycle-controlling methods from the slides:
- start(): called on each thread to begin execution. [+] Internally the JVM creates a new call stack and invokes your run() on it. Important distinction: calling run() directly would just execute the code on the current thread (no new thread); only start() creates a real thread.
- sleep(time): suspends the current thread for the given time. It pauses whoever calls it.
- join(): pauses the execution of the current thread until the thread on which join() was called has finished. In the example, MultiThreadedExample calls join() on the NumberPrinter thread, so the main code waits for NumberPrinter to complete before continuing.

These four methods (run, start, sleep, join) are the minimum toolkit to launch work and coordinate it. The two most-tested facts: start() vs run() (only start spawns a thread) and what join() waits on (the target thread, blocking the caller). [+] Use case for join(): when the main thread must collect results only after worker threads finish, e.g. wait for all matrix-row threads before summing the answer.

Threads are not free, so we sometimes need to stop them. The deck lists the resources a thread consumes: memory and kernel resources, plus CPU cycles and cache memory. Two motivations for termination: a thread finished its work but the application is still running, so we want to clean up its resources; or a thread is misbehaving and we want to stop it. A critical rule: by default the application will not stop as long as at least one thread is still running.
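The "by default" in that rule refers to ordinary (non-daemon) threads. Java also lets a thread be marked as a daemon before it starts, and the JVM does not wait for daemon threads when deciding whether to exit. The sketch below is an addition to the deck's material; `DaemonDemo` and `backgroundTicker` are hypothetical names.

```java
public class DaemonDemo {
    // Builds (but does not start) a thread that idles until it is interrupted.
    static Thread backgroundTicker(boolean daemon) {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(200);
                } catch (InterruptedException e) {
                    return;           // interrupted while sleeping: stop
                }
            }
        });
        t.setDaemon(daemon);          // must be set BEFORE start()
        return t;
    }

    public static void main(String[] args) {
        Thread ticker = backgroundTicker(true);
        ticker.start();
        System.out.println("main finished; ticker is daemon = " + ticker.isDaemon());
        // Daemon: the JVM exits now. With backgroundTicker(false) the process would keep running.
    }
}
```

Daemon threads are abandoned abruptly at exit, so they suit housekeeping work rather than anything that must finish cleanly.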
The slides frame stopping a thread as interrupting it, and list when interruption is possible:
- The thread is executing a method that throws InterruptedException (reference: ThreadInterruptExample.java). [+] Blocking methods like sleep(), wait(), and join() throw this, so an interrupt can break a thread out of waiting.
- The thread's code handles the interrupt signal explicitly (reference: ThreadExplicitInterruptExample.java). [+] Here the thread periodically checks its interrupted status (e.g. Thread.currentThread().isInterrupted()) and chooses to exit.

Both files are written out in Code Samples → Interrupts.
interrupt() is a cooperative request, not a forced kill: the target thread must be written to notice and respond to it. Java deprecated the old Thread.stop() (and JDK 20+ makes it throw UnsupportedOperationException) because force-killing a thread mid-operation could leave shared data in a corrupt state. The "app won't exit while a thread runs" rule explains why a program sometimes seems to hang after its main logic finishes: a stray thread is still alive.

The deck shows the thread lifecycle as a diagram only; [+] the following are the standard Java thread states (as defined by Thread.State) that the diagram represents:
| State [+] | Meaning |
|---|---|
| NEW | Thread object created but start() not yet called. |
| RUNNABLE | After start(); either running or ready and waiting for CPU. |
| BLOCKED | Waiting to acquire a lock/monitor held by another thread. |
| WAITING | Waiting indefinitely for another thread (e.g. join(), wait() with no timeout). |
| TIMED_WAITING | Waiting for a set time (e.g. sleep(t), join(t), wait(t)). |
| TERMINATED | run() has finished or the thread was stopped. |
[+] Flow: a thread starts at NEW, moves to RUNNABLE on
start(), and from RUNNABLE may bounce into BLOCKED, WAITING, or TIMED_WAITING and back as
it competes for locks or pauses, finally reaching TERMINATED when run() ends. It cannot be
restarted once TERMINATED. The runnable demo that prints these states live is in Code
Samples.
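The two states that are hardest to observe are BLOCKED and WAITING. The sketch below, an illustration with hypothetical names (`StateProbe`, `awaitState`, `observeBlocked`, `observeWaiting`), provokes each one on purpose and polls getState() until it appears, giving up after roughly two seconds.

```java
public class StateProbe {
    // Polls until the thread reports the wanted state, or about 2 s have passed.
    static Thread.State awaitState(Thread t, Thread.State wanted) throws InterruptedException {
        for (int i = 0; i < 200 && t.getState() != wanted; i++) Thread.sleep(10);
        return t.getState();
    }

    // BLOCKED: the contender wants a monitor that the caller is holding.
    static Thread.State observeBlocked() throws InterruptedException {
        final Object lock = new Object();
        Thread contender = new Thread(() -> { synchronized (lock) { /* enter, then leave */ } });
        Thread.State seen;
        synchronized (lock) {                      // caller owns the monitor
            contender.start();
            seen = awaitState(contender, Thread.State.BLOCKED);
        }                                          // released: contender can proceed
        contender.join();
        return seen;
    }

    // WAITING: the waiter join()s the caller's thread, with no timeout.
    static Thread.State observeWaiting() throws InterruptedException {
        final Thread caller = Thread.currentThread();
        Thread waiter = new Thread(() -> {
            try { caller.join(); } catch (InterruptedException e) { /* woken up: exit */ }
        });
        waiter.start();
        Thread.State seen = awaitState(waiter, Thread.State.WAITING);
        waiter.interrupt();                        // break it out of join()
        waiter.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("lock contention -> " + observeBlocked());
        System.out.println("join()          -> " + observeWaiting());
    }
}
```

The first call shows a thread stuck on a lock and the second a thread waiting for another thread, which is the BLOCKED versus WAITING distinction in miniature.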
sleep() produces TIMED_WAITING, join() produces WAITING, lock contention produces BLOCKED. [+] Interview classic: explaining the difference between BLOCKED (waiting for a lock) and WAITING (waiting for a signal/another thread).

The first hands-on activity multiplies a 1000×1000 matrix, first on a single thread, then in parallel. The parallelization idea posed by the slides: split the row multiplication into separate threads (e.g. four threads each handling a share of the rows), then wait for all the threads to complete before using the result. You then compare single-thread vs. multi-thread execution time for the same 1000×1000 matrix. Full runnable code is in Code Samples → Matrix.
Performance is measured with latency: defined in the deck as the time delay (usually in milliseconds) between the initiation of an action and the response or result of that action. [+] Lower latency is better. Splitting independent rows across threads is a good fit because each row's computation does not depend on the others, so they can run truly in parallel on multiple cores — and on a multi-core machine the multi-threaded version is typically faster.
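Latency as defined above can be captured with a tiny helper that timestamps before and after an action. This sketch uses System.nanoTime(), which is intended for measuring elapsed time, rather than the System.currentTimeMillis() calls in the code samples; that substitution and the names `Latency` and `measureMillis` are assumptions of this example.

```java
public class Latency {
    // Runs the action once and returns the elapsed time in whole milliseconds.
    static long measureMillis(Runnable action) {
        long start = System.nanoTime();            // monotonic clock, unaffected by wall-clock changes
        action.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long ms = measureMillis(() -> {
            long sum = 0;
            for (long i = 0; i < 500_000_000L; i++) sum += i;   // stand-in workload
            if (sum < 0) System.out.println("unreachable");     // keeps the loop from being discarded
        });
        System.out.println("latency: " + ms + " ms");
    }
}
```

A single measurement is noisy, especially on a freshly started JVM, so repeating the run a few times and comparing gives a fairer picture.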
Waiting for every row thread before reading the result is where join() (from item 7) earns its keep.

Parallelism is not free, and the image-inverter activity makes the costs visible. The
image inverter inverts any RGB image, achieving parallelism by splitting and processing the image
row-wise, where the number of splits equals the number of threads. Images come in five
qualities: ULTRA_LOW, VERY_LOW, LOW, MEDIUM, HIGH. You tinker by changing the image quality (in
FILE_SLUG) and numOfThreads, then measure latency. Runnable code is in Code Samples → Image Inverter.
The deck's observed conclusions:
The performance-analysis takeaway is stated as a rule: for CPU-bound work like these activities, it is
best when the number of threads equals the number of available processors, obtained in Java via
Runtime.getRuntime().availableProcessors().
- sleep() yields the CPU and the scheduler can switch anytime; never assume an interleaving.
- Runnable.run() holds the work, start() launches a thread, sleep() pauses the caller, join() makes the caller wait for another thread.
- Stopping a thread is a cooperative request via interrupt(), not a forced kill.

Every file the notes refer to, written out and runnable. Compile a file with
javac File.java and run with java ClassName. All examples target plain Java 8+
with no external libraries.
Demonstrates the core weakness of a single thread: a long task blocks everything after it.
```java
// One thread does everything in order. The "heavy" task
// blocks the line after it until it fully completes.
public class SingleThreadedExample {

    static void heavyTask() {
        System.out.println("Heavy task: started...");
        long sum = 0;
        for (long i = 0; i < 2_000_000_000L; i++) {
            sum += i; // busy work to simulate a slow job
        }
        System.out.println("Heavy task: done. sum=" + sum);
    }

    public static void main(String[] args) {
        System.out.println("App started");
        heavyTask(); // everything below WAITS for this
        System.out.println("This line only runs AFTER the heavy task");
        System.out.println("App finished");
    }
}
```

Output:

```text
App started
Heavy task: started...
Heavy task: done. sum=1999999999000000000
This line only runs AFTER the heavy task
App finished
```
Takeaway: there is no second thread to do anything while heavyTask() runs. In
a UI app, this is the "Not Responding" freeze.
This single file covers items 5, 6, 7, and 9: a Runnable task (NumberPrinter),
two threads interleaving because of sleep(), join() making main wait, and a
peek at thread states.
```java
// Runnable defines a TASK; the Thread runs it.
class NumberPrinter implements Runnable {
    private final String name;

    NumberPrinter(String name) { this.name = name; }

    @Override
    public void run() { // the work; NEVER call this directly
        for (int i = 1; i <= 3; i++) {
            System.out.println(name + " -> N" + i);
            try {
                Thread.sleep(300); // pause CURRENT thread -> yields CPU
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag, then stop
                return;
            }
        }
    }
}

public class MultiThreadedExample {
    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(new NumberPrinter("Thread-1"));
        Thread t2 = new Thread(new NumberPrinter("Thread-2"));

        System.out.println("t1 state before start: " + t1.getState()); // NEW

        t1.start(); // start() spawns a real thread -> JVM calls run()
        t2.start();

        // Usually RUNNABLE; TIMED_WAITING if t1 has already reached its sleep()
        System.out.println("t1 state after start: " + t1.getState());

        // join(): main WAITS here until both finish before printing the last line
        t1.join();
        t2.join();

        System.out.println("t1 state after join: " + t1.getState()); // TERMINATED
        System.out.println("All threads done. Main continues.");
    }
}
```

Sample output (one possible run; the order of the middle lines varies):

```text
t1 state before start: NEW
t1 state after start: RUNNABLE
Thread-1 -> N1
Thread-2 -> N1
Thread-1 -> N2
Thread-2 -> N2
Thread-1 -> N3
Thread-2 -> N3
t1 state after join: TERMINATED
All threads done. Main continues.
```
Why interleaved: each sleep(300) releases the CPU, letting the scheduler run
the other thread. Run it a few times — the interleaving changes. The "All threads done" line is
guaranteed last only because of join().
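```java
// Wrapper class added so the two options compile as one file; its name is illustrative.
public class CreateThreadOptions {

    // Option 2: extend Thread (less flexible: uses up your one superclass)
    static class MyThread extends Thread {
        @Override
        public void run() { System.out.println("hi from MyThread"); }
    }

    public static void main(String[] args) {
        // Option 1 (preferred): implement Runnable, pass to a Thread
        Runnable task = () -> System.out.println("hi from " + Thread.currentThread().getName());
        new Thread(task).start();

        new MyThread().start();
    }
}
```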
Two distinct interruption patterns the deck names.
C1: interrupt via InterruptedException (ThreadInterruptExample.java)

```java
// The worker is sleeping (blocked). interrupt() makes sleep()
// throw InterruptedException, breaking it out of the wait.
public class ThreadInterruptExample {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                System.out.println("Worker: sleeping for 10s...");
                Thread.sleep(10_000);
                System.out.println("Worker: finished sleep (won't reach here)");
            } catch (InterruptedException e) {
                System.out.println("Worker: interrupted while sleeping -> exiting");
            }
        });

        worker.start();
        Thread.sleep(1_000); // let it start sleeping
        System.out.println("Main: sending interrupt");
        worker.interrupt(); // polite request, not a kill
        worker.join();
        System.out.println("Main: done");
    }
}
```

Output:

```text
Worker: sleeping for 10s...
Main: sending interrupt
Worker: interrupted while sleeping -> exiting
Main: done
```
C2: explicit interrupt check (ThreadExplicitInterruptExample.java)

```java
// No blocking call to throw the exception. Instead the loop
// polls its own interrupted flag and chooses to stop.
public class ThreadExplicitInterruptExample {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            long count = 0;
            while (!Thread.currentThread().isInterrupted()) {
                count++; // tight CPU work
                if (count % 100_000_000L == 0) {
                    System.out.println("Worker: still running, count=" + count);
                }
            }
            System.out.println("Worker: noticed interrupt flag -> stopping");
        });

        worker.start();
        Thread.sleep(500);
        System.out.println("Main: requesting stop");
        worker.interrupt(); // sets the flag the loop checks
        worker.join();
        System.out.println("Main: done");
    }
}
```

Sample output (the number of "still running" lines depends on machine speed):

```text
Worker: still running, count=100000000
Worker: still running, count=200000000
Main: requesting stop
Worker: noticed interrupt flag -> stopping
Main: done
```
C1 relies on a blocking call (sleep) throwing InterruptedException; C2 has no blocking call, so it must
poll isInterrupted() itself. Both treat interrupt() as a cooperative request.
Multiplies two N×N matrices and times it. The multi-threaded version splits rows across threads and
uses join() to wait for all of them.
```java
public class MatrixMultiplication {
    static final int N = 1000; // 1000 x 1000

    // --- Single-threaded baseline ---
    static int[][] multiplySingle(int[][] a, int[][] b) {
        int[][] c = new int[N][N];
        for (int i = 0; i < N; i++) computeRow(a, b, c, i);
        return c;
    }

    // --- Multi-threaded: split rows across `threads` workers ---
    static int[][] multiplyMulti(int[][] a, int[][] b, int threads) throws InterruptedException {
        int[][] c = new int[N][N];
        Thread[] pool = new Thread[threads];
        int chunk = (N + threads - 1) / threads; // rows per thread (rounded up)
        for (int t = 0; t < threads; t++) {
            final int start = t * chunk;
            final int end = Math.min(start + chunk, N);
            pool[t] = new Thread(() -> {
                for (int i = start; i < end; i++) computeRow(a, b, c, i);
            });
            pool[t].start();
        }
        for (Thread th : pool) th.join(); // WAIT for all rows
        return c;
    }

    // Computes row i of c = a x b; each call writes only to c[i], so rows never collide.
    static void computeRow(int[][] a, int[][] b, int[][] c, int i) {
        for (int j = 0; j < N; j++) {
            int sum = 0;
            for (int k = 0; k < N; k++) sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[][] a = randomMatrix(), b = randomMatrix();

        long t0 = System.currentTimeMillis();
        multiplySingle(a, b);
        System.out.println("Single-thread latency: " + (System.currentTimeMillis() - t0) + " ms");

        int cores = Runtime.getRuntime().availableProcessors();
        long t1 = System.currentTimeMillis();
        multiplyMulti(a, b, cores);
        System.out.println("Multi-thread latency (" + cores + " threads): "
                + (System.currentTimeMillis() - t1) + " ms");
    }

    static int[][] randomMatrix() {
        int[][] m = new int[N][N];
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                m[i][j] = (int) (Math.random() * 10);
        return m;
    }
}
```

Sample output (timings vary by machine):

```text
Single-thread latency: 4120 ms
Multi-thread latency (8 threads): 760 ms
```
Each row is independent (no shared writes to the same cell), so this is "embarrassingly parallel" — the multi-thread version should be markedly faster on a multi-core machine.
Inverts an RGB image by splitting it into horizontal bands, one thread per band. Swap
FILE_SLUG (image size) and numOfThreads to observe where extra threads stop
helping.
```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;

public class ImageInverter {
    // quality levels from the deck: ULTRA_LOW, VERY_LOW, LOW, MEDIUM, HIGH
    static final String FILE_SLUG = "MEDIUM"; // <- tinker
    static final int numOfThreads = 4;        // <- tinker

    // Inverts the R, G and B channels of rows [yStart, yEnd); alpha is left unchanged.
    static void invertBand(BufferedImage img, int yStart, int yEnd) {
        for (int y = yStart; y < yEnd; y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int a = (rgb >> 24) & 0xff;
                int r = 255 - ((rgb >> 16) & 0xff);
                int g = 255 - ((rgb >> 8) & 0xff);
                int bl = 255 - (rgb & 0xff);
                img.setRGB(x, y, (a << 24) | (r << 16) | (g << 8) | bl);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        BufferedImage img = ImageIO.read(new File("input_" + FILE_SLUG + ".png"));
        int h = img.getHeight();
        int band = (h + numOfThreads - 1) / numOfThreads; // rows per thread (rounded up)
        Thread[] pool = new Thread[numOfThreads];

        long t0 = System.currentTimeMillis();
        for (int t = 0; t < numOfThreads; t++) {
            final int yStart = t * band;
            final int yEnd = Math.min(yStart + band, h);
            pool[t] = new Thread(() -> invertBand(img, yStart, yEnd));
            pool[t].start();
        }
        for (Thread th : pool) th.join(); // aggregate: wait for all bands

        System.out.println("Inverted " + FILE_SLUG + " with " + numOfThreads
                + " threads in " + (System.currentTimeMillis() - t0) + " ms");
        System.out.println("Cores available: " + Runtime.getRuntime().availableProcessors());
        ImageIO.write(img, "png", new File("output_" + FILE_SLUG + ".png"));
    }
}
```

Sample output (timings vary by machine):

```text
Inverted MEDIUM with 4 threads in 38 ms
Cores available: 8
```
Try numOfThreads = 1, 2, 4, 8, 16, 32. Latency drops until you hit roughly your core count, then flattens or worsens. On
a small image (ULTRA_LOW), 1 thread can beat 16 because thread-creation overhead dominates the tiny job.
Do these in order. Each exercise has a goal, steps, and a
collapsible solution — try first, then check. Everything runs with plain
javac / java; no libraries except Exercise 6.
1. Check your toolchain: java -version and javac -version (need JDK 8+).
2. Make a working folder: mkdir threads-lab && cd threads-lab
3. Save each exercise as FileName.java, then compile + run: javac FileName.java then java ClassName
4. Find your core count: System.out.println(Runtime.getRuntime().availableProcessors());

Goal: Launch two threads that each print their name 5 times, and prove the output interleaves.
1. Create a class Greeter that implements Runnable; in run() loop 5 times printing name + " says hi #" + i.
2. In main, create two Thread objects wrapping two Greeters ("Alice", "Bob") and start() both.
3. Predict: what happens if you call run() directly instead of start()?

Solution:

```java
class Greeter implements Runnable {
    private final String name;

    Greeter(String n) { name = n; }

    @Override
    public void run() {
        for (int i = 1; i <= 5; i++)
            System.out.println(name + " says hi #" + i);
    }
}

public class Ex1 {
    public static void main(String[] a) {
        new Thread(new Greeter("Alice")).start();
        new Thread(new Greeter("Bob")).start();
    }
}
```
Answer to the prediction: calling run() directly
runs the code on the main thread — no new thread, no interleaving. You'd see all of
Alice then all of Bob, in order. Only start() spawns a thread.
Goal: Print "DONE" only after a worker thread truly finishes: first without join() (buggy), then fix it.

1. Create a worker thread that sleeps 1 second, then prints "work complete".
2. In main: start it, immediately print "DONE". Observe that "DONE" prints before "work complete".
3. Add worker.join() before printing "DONE" and re-run.

Solution:

```java
public class Ex2 {
    public static void main(String[] a) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag instead of swallowing it
            }
            System.out.println("work complete");
        });
        worker.start();
        worker.join(); // remove this line to see the bug
        System.out.println("DONE");
    }
}
```
Without join(), main races ahead and prints "DONE" first. With it,
main blocks until the worker's run() ends.
Goal: Print a thread's getState() at NEW, RUNNABLE,
TIMED_WAITING, and TERMINATED.
1. Create a thread whose run() sleeps 500ms.
2. Print its state before starting it (expect NEW).
3. Call start(), then print state (expect RUNNABLE).
4. Sleep 100ms in main, then print state (expect TIMED_WAITING).
5. Call join(), then print state (expect TERMINATED).

Solution:

```java
public class Ex3 {
    public static void main(String[] a) throws InterruptedException {
        Thread w = new Thread(() -> {
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                // nothing left to do: the thread ends either way
            }
        });
        System.out.println("NEW? " + w.getState());
        w.start();
        System.out.println("RUNNABLE? " + w.getState());
        Thread.sleep(100); // by now the worker is inside its own sleep()
        System.out.println("TIMED? " + w.getState());
        w.join();
        System.out.println("TERMINATED? " + w.getState());
    }
}
```
Goal: Write a counter thread that runs forever until interrupted, then stop it cleanly from main after 1 second.
1. In the worker, loop while (!Thread.currentThread().isInterrupted()), incrementing a counter.
2. In main: start the worker, sleep 1 second, call interrupt(), then join().
3. Variation: put a sleep() inside the loop and catch InterruptedException instead; compare the two stop mechanisms (C1 vs C2 above).

Solution:

```java
public class Ex4 {
    public static void main(String[] a) throws InterruptedException {
        Thread w = new Thread(() -> {
            long n = 0;
            while (!Thread.currentThread().isInterrupted()) n++; // poll the flag every iteration
            System.out.println("Stopped. counted to " + n);
        });
        w.start();
        Thread.sleep(1000);
        w.interrupt(); // cooperative request: the loop notices and exits
        w.join();
    }
}
```
Pitfall to notice: if you swallow InterruptedException in a
sleep()-based loop without re-checking the flag, you can accidentally make a
thread un-interruptible. Always either exit or call
Thread.currentThread().interrupt() to restore the flag.
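The pitfall can be avoided as in the sketch below, which is one possible take on the sleep()-based variation of Exercise 4 rather than an official solution. sleep() clears the interrupted flag when it throws, so the catch block sets it again and the loop condition then ends the loop. `PoliteSleeper` and `runUntilInterrupted` are hypothetical names.

```java
public class PoliteSleeper {
    // Starts a sleep-based worker, interrupts it after the given delay,
    // and returns how many loop iterations it completed.
    static long runUntilInterrupted(long stopAfterMillis) throws InterruptedException {
        long[] ticks = new long[1];
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                ticks[0]++;
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    // sleep() cleared the flag when it threw; set it again
                    // so the while-condition sees the request and exits.
                    Thread.currentThread().interrupt();
                }
            }
        });
        worker.start();
        Thread.sleep(stopAfterMillis);
        worker.interrupt();
        worker.join();                 // returns only if the worker really stopped
        return ticks[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("iterations before stop: " + runUntilInterrupted(200));
    }
}
```

Deleting the interrupt() call inside the catch block turns this into the un-interruptible loop the pitfall warns about, and join() would then never return.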
Goal: Sum a large array single-threaded, then split across N threads
with join(), and find where adding threads stops helping.
1. Create a long[] of 100 million elements filled with 1 (so the answer is known: 100,000,000). The array needs about 800 MB of heap; lower N or raise -Xmx if you hit OutOfMemoryError.
2. Write sumParallel(arr, threads): give each thread a slice, each computes a partial sum into partials[t], join() all, then add the partials.
3. Time it for 1, 2, 4, 8 and 16 threads and compare against availableProcessors().
4. Think: why does each thread write to partials[t] instead of a shared total variable?

Solution:

```java
public class Ex5 {
    static final int N = 100_000_000;

    static long sumParallel(long[] arr, int threads) throws InterruptedException {
        Thread[] pool = new Thread[threads];
        long[] partials = new long[threads]; // each thread owns one slot
        int chunk = (arr.length + threads - 1) / threads;
        for (int t = 0; t < threads; t++) {
            final int idx = t, s = t * chunk, e = Math.min(s + chunk, arr.length);
            pool[t] = new Thread(() -> {
                long sum = 0;
                for (int i = s; i < e; i++) sum += arr[i];
                partials[idx] = sum; // no shared write contention
            });
            pool[t].start();
        }
        for (Thread th : pool) th.join();
        long total = 0;
        for (long p : partials) total += p; // aggregate AFTER join
        return total;
    }

    public static void main(String[] a) throws InterruptedException {
        long[] arr = new long[N];
        java.util.Arrays.fill(arr, 1);
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores = " + cores);
        for (int th : new int[]{1, 2, 4, 8, 16}) {
            long t0 = System.currentTimeMillis();
            long sum = sumParallel(arr, th);
            System.out.println(th + " threads: " + (System.currentTimeMillis() - t0)
                    + " ms (sum=" + sum + ")");
        }
    }
}
```
Answer: if every thread did total += partial on
one shared variable, the unsynchronized read-modify-write would race and lose updates — a
classic data race. Per-thread slots avoid contention entirely; you combine them safely only
after every thread has joined.
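If a single shared total were really wanted, the read-modify-write would have to be made atomic. The sketch below swaps in java.util.concurrent.atomic.AtomicLong for that purpose; it is an alternative shown for contrast, not the exercise's intended solution, and `SharedTotal` and `sumWithAtomic` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SharedTotal {
    // Every thread adds its local sum to ONE shared counter, safely.
    static long sumWithAtomic(long[] arr, int threads) throws InterruptedException {
        AtomicLong total = new AtomicLong();
        Thread[] pool = new Thread[threads];
        int chunk = (arr.length + threads - 1) / threads;
        for (int t = 0; t < threads; t++) {
            final int s = Math.min(t * chunk, arr.length);
            final int e = Math.min(s + chunk, arr.length);
            pool[t] = new Thread(() -> {
                long local = 0;
                for (int i = s; i < e; i++) local += arr[i];   // private to this thread
                total.addAndGet(local);                        // one atomic update per thread
            });
            pool[t].start();
        }
        for (Thread th : pool) th.join();
        return total.get();
    }

    public static void main(String[] args) throws InterruptedException {
        long[] arr = new long[1_000_000];
        java.util.Arrays.fill(arr, 1);
        System.out.println(sumWithAtomic(arr, 4)); // prints 1000000
    }
}
```

Each thread still accumulates locally and touches the shared counter only once. Calling addAndGet for every element would stay correct but would make the threads contend on every step.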
Goal: Reproduce Activity #2 end-to-end and measure the latency vs. thread-count curve on real images.
1. Prepare five test images named input_ULTRA_LOW.png … input_HIGH.png.
2. Copy ImageInverter.java from Code Samples → E.
3. Set FILE_SLUG = "HIGH", run with numOfThreads = 1, 2, 4, 8, 16, 32 and record each latency.
4. Repeat with FILE_SLUG = "ULTRA_LOW".
5. Open output_*.png to confirm colors are inverted.
6. Check whether the best latency lands near numOfThreads ≈ Runtime.getRuntime().availableProcessors().

Stretch goal: replace the raw Thread[] with an
ExecutorService thread pool (Executors.newFixedThreadPool(cores))
and compare; pools reuse threads instead of recreating them per run.
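To show the shape of the pool-based approach without needing image files, the sketch below applies Executors.newFixedThreadPool to the array sum from Exercise 5 instead of the image inverter. The class name `PooledSum` and the choice of workload are assumptions of this example; the same submit-then-get pattern would apply to image bands.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PooledSum {
    // Splits the array into `threads` slices and sums them on a fixed-size pool.
    static long sum(long[] arr, int threads) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (arr.length + threads - 1) / threads;
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int s = Math.min(t * chunk, arr.length);
                final int e = Math.min(s + chunk, arr.length);
                parts.add(pool.submit(() -> {          // Callable: returns the slice's sum
                    long local = 0;
                    for (int i = s; i < e; i++) local += arr[i];
                    return local;
                }));
            }
            long total = 0;
            for (Future<Long> f : parts) total += f.get(); // get() blocks, playing the role of join()
            return total;
        } finally {
            pool.shutdown();                           // otherwise idle pool threads keep the JVM alive
        }
    }

    public static void main(String[] args) throws Exception {
        long[] arr = new long[10_000_000];
        Arrays.fill(arr, 1);
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("sum = " + sum(arr, cores));   // prints "sum = 10000000"
    }
}
```

Future.get() replaces both join() and the partials array, since each task hands back its own result. The shutdown() call matters because pool threads are non-daemon by default.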
join() guarantees,
(3) name which state each pause in Ex3 produces, (4) describe why interrupt() in Ex4 is
cooperative, and (5) explain the per-thread-partials trick in Ex5 in terms of race conditions.