Java · Concurrency

Intro to Threads

A self-contained learning resource on Java threading fundamentals — concept notes, runnable code samples for every referenced file, and a hands-on lab to practice.

Items marked [+] were added or corrected beyond the original slides. Code blocks marked runnable compile and run as-is on Java 8+.

I. Notebook Summary (copy by hand)

Fast-recall, hand-copyable cheat sheet. Diagrams are sketched as doodles.

1. Program / Process / Thread

Program → static instructions on disk (not running) = recipe book
Process → 1 running instance of a program; OWN memory space = chef cooking one recipe
Thread → smallest unit of execution INSIDE a process; SHARES process resources (mem, files) = helper sharing kitchen + ingredients
Hierarchy: Program ⊃ Process ⊃ Thread(s)
1 Process → many Threads (share memory)
1 Program → many Processes (separate memory)

doodle: book → [chef in box] → 2 little helpers inside same box

2. Multitasking

Multitasking → run many tasks "at once" on CPU
• many independent PROCESSES = Multiprocessing
• CPU allocates time-slices per task
• CPU SWITCHES between tasks (context switch)
• e.g. many apps open / many servers on network

3. Concurrency vs Parallelism

Concurrency → progress on many tasks, seemingly at once (1 CPU, fast switching, interleaved)
Parallelism → tasks split into subtasks, run AT SAME instant on MANY CPUs/cores
Need: >1 thread, each on its OWN CPU/core, for TRUE parallelism
Combo: Parallel-Concurrent = threads spread over CPUs; same-CPU threads = concurrent, cross-CPU = parallel

Aspect    Concurrency          Parallelism
CPUs      1 enough             needs many
Timing    interleaved          simultaneous
Idea      dealing with many    doing many

doodle: 1 lane juggling = concurrency; 2 parallel lanes = parallelism

4. Single-Threaded Process

Single-thread → 1 task at a time, SEQUENTIAL
• long task → app FREEZES (unresponsive)
• all resources in one thread
• file: SingleThreadedExample.java

5. Multithreading / Multi-Threaded Process

Multithreading → many threads INSIDE one process
• splits 1 process → concurrent threads
• CPU context-switches between threads
• threads SHARE memory + file handles
• app stays RESPONSIVE (work in side threads)
• overhead = context switching
• good for: divisible / parallel subtasks
• e.g. video encoding split across threads
• file: MultiThreadedExample.java

doodle: one box (process) with 3 arrows running in parallel inside it

6. Why threads interleave (demo)

2 threads printing N1,N2,N3 with sleep() between
Output NOT 1-by-1 → order interleaves / unpredictable
WHY: sleep() releases CPU → scheduler runs other thread → no guaranteed order without join/sync

doodle: two rows of beads (T1, T2) zig-zag interleaving along a time arrow →

7. Core thread methods

Runnable → interface; defines a TASK for a thread
  single method run() = code to execute
run() → holds the work (NOT called directly)
start() → begins thread execution → JVM calls run()
sleep(ms) → suspend CURRENT thread for given time
join() → caller WAITS until target thread finishes (MultiThreadedExample waits for NumberPrinter)
[+] interrupt() → signal a thread to stop

Method     Acts on
start()    new thread → run()
sleep()    current thread (pause)
join()     current waits for other

8. Thread Termination & Interrupt

Why kill threads? → they COST resources:
• memory + kernel resources
• CPU cycles + cache
Reasons: finished work / misbehaving thread
Rule: app keeps running while ≥1 thread alive
Interrupt a thread WHEN:
• it runs a method that throws InterruptedException → ThreadInterruptExample.java
• its code handles interrupt signal explicitly → ThreadExplicitInterruptExample.java
[+] interrupt() = polite request, not force-kill

9. Thread Life Cycle [+]

[+] States (Java Thread.State):
NEW → object made, not started
RUNNABLE → after start(); ready/running
BLOCKED → waiting for a lock/monitor
WAITING → join()/wait() no timeout
TIMED_WAITING → sleep(t)/join(t)/wait(t)
TERMINATED → run() finished or stopped
flow: NEW →start()→ RUNNABLE ↔ (BLOCKED/WAITING/TIMED_WAITING) → run() ends → TERMINATED

doodle: NEW box → RUNNABLE circle with 3 side-pockets (blocked/waiting/timed) → END box

10. Parallelization Performance

Latency → time delay (ms) between action start & result
Activity: 1000×1000 matrix → 1 thread vs split rows → multi-thread usually FASTER (split row mult.)
Image inverter → split image row-wise; #splits = #threads → quality levels: ULTRA_LOW..VERY_LOW..LOW..MEDIUM..HIGH
GOLDEN RULE: best when #threads == #available processors
  Runtime.getRuntime().availableProcessors()
Beyond #cores → too much context switch → NO gain / WORSE
Low-quality / small job → single thread can WIN (thread-creation overhead > benefit)
Cost of parallelism = thread creation + context switch + aggregating results

doodle: speed-vs-threads curve: rises, peaks at #cores, then flat/dips


II. Detailed Study Notes

1. Program vs. Process vs. Thread

These three words describe code at three different stages of life. A program is a static set of instructions stored on disk — it is not doing anything yet, just sitting there like a recipe book on a shelf. A process is an instance of that program actually running, and crucially it has its own memory space; think of a chef who has picked one recipe and is cooking it in the kitchen. A thread is the smallest unit of execution inside a process, and it shares the process's resources — like a helper working on a task from the same recipe, sharing the same kitchen and ingredients as the chef.

The intuition that ties them together is ownership of memory. Because a process has its own isolated memory, two processes cannot easily step on each other's data. Because threads live inside one process and share its memory, they are cheap to create and can communicate quickly — but that shared memory is also exactly what makes multithreading tricky, since two threads touching the same data can corrupt it.

Why it matters: almost every performance and correctness question in this topic comes back to "is this isolated (process) or shared (thread)?" [+] Real-world example: a web browser runs each tab as a separate process for isolation (a crashed tab doesn't kill the browser), while within one tab multiple threads handle rendering, networking, and JavaScript concurrently. [+] Common misconception: people say "thread" and "process" interchangeably — but threads share memory and processes do not, and that single difference drives almost everything else.
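The "threads share memory" point can be seen in a few lines. The following is a minimal sketch, not taken from the slides; the class name SharedMemoryDemo and the method writeFromWorker are made up for illustration. A worker thread writes into an array that the calling thread created, and the caller reads the value back after join().

```java
public class SharedMemoryDemo {

    // Hypothetical helper: a worker thread writes into memory owned by the
    // caller, and the caller reads the result once the worker has finished.
    static int writeFromWorker() throws InterruptedException {
        int[] shared = new int[1];   // one heap object, reachable from both threads

        Thread worker = new Thread(() -> shared[0] = 42);  // the worker writes
        worker.start();
        worker.join();               // wait for the worker; its write is now visible

        return shared[0];            // the caller reads what the worker wrote
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Value written by worker: " + writeFromWorker());
    }
}
```

Two separate processes could not do this with a plain array; they would need an explicit channel such as a file, pipe, or socket.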

2. Multitasking

Multitasking is the ability to make progress on multiple tasks concurrently using the CPU. In the form the slides describe, it involves running multiple independent processes — this is called multiprocessing — or multiple tasks concurrently. The operating system allocates CPU time to each task and, because a single CPU core can truly only execute one instruction stream at a time, it must switch between the different tasks rapidly. Everyday examples: running several applications at once on your computer, or several servers on a network.

The reasoning is that switching happens so fast (thousands of times a second) that to a human it feels simultaneous, even on a single core. [+] Each switch is called a "context switch": the CPU saves the state of the current task and loads the state of the next one. This switching is the conceptual seed of context-switching overhead that comes up again in multithreading.

Why it matters: multitasking at the process level (multiprocessing) is the coarse-grained cousin of multithreading. Same goal — do more at once — but with isolated memory instead of shared memory.

3. Concurrency vs. Parallelism

These are the two ideas students most often confuse, so pin down the distinction precisely. Concurrency means an application is making progress on more than one task at the same time, or at least seemingly at the same time. A single CPU can be concurrent by rapidly interleaving tasks. Parallel execution is when a computer has more than one CPU or CPU core and genuinely makes progress on more than one task simultaneously. Parallelism (the broader term) means an application splits its work into smaller subtasks that can be processed in parallel — for instance on multiple CPUs at the exact same time.

The slides also name a middle case, parallel concurrent execution: threads are distributed across multiple CPUs, so threads on the same CPU run concurrently while threads on different CPUs run in parallel. The key requirement stated explicitly: to achieve true parallelism your application must have more than one thread running, and each thread must run on a separate CPU / core / GPU core.

Aspect          Concurrency                  Parallelism
Hardware        One core is enough           Needs multiple cores
Execution       Interleaved (time-sliced)    Simultaneous
One-liner [+]   Dealing with many things     Doing many things at once

Why it matters & interview point: the crisp framing, "concurrency is about structure / dealing with many tasks; parallelism is about simultaneous execution of them", is a classic interview answer. [+] You can have concurrency without parallelism (one core juggling tasks), and concurrency is the prerequisite that enables parallelism when extra cores exist. Common pitfall: assuming more threads automatically means parallel speedup. Without enough cores, extra threads just add concurrency and overhead.

4. Single-Threaded Process

A single-threaded process executes one task at a time, sequentially, one after the other. Everything runs on one thread, so all resources are managed there. The major downside the slides highlight: if a task takes a long time, the process may become unresponsive — there is no second thread free to respond to anything else meanwhile. The reference example is SingleThreadedExample.java (see Code Samples → Single-Threaded).

The intuition is a single cashier serving a queue: simple and predictable, but one slow customer holds up everyone behind them. [+] This is exactly why a UI freezes ("Not Responding") when the main thread does heavy work — there is no other thread to keep the interface alive.

Trade-offs: single-threaded code is simpler to reason about and free of shared-data races, which is a real advantage. It only becomes a problem when work is long-running or when responsiveness matters.

5. Multithreading & the Multi-Threaded Process

Multithreading means creating multiple threads within a single process so they can execute concurrently. It divides one process into multiple threads, gives each thread CPU time, and, like multitasking, relies on the CPU context-switching between threads. The slides note it can enhance computational power and is used, for example, to split a video-encoding task into multiple threads to achieve parallelism.

A multi-threaded process therefore executes multiple threads concurrently (and in parallel when cores allow). Its defining properties from the deck: it can stay responsive by doing work in separate threads; threads share resources like memory and file handles within the same process; it requires context switching between threads, which adds some overhead; and it is suitable for tasks that can be divided into parallel subtasks or that require concurrent execution. Reference example: MultiThreadedExample.java (see Code Samples → Multi-Threaded).

Why it matters & trade-offs: multithreading is the practical payoff of this whole topic. The benefits — responsiveness and throughput — come bundled with two costs: context-switching overhead and the danger of shared mutable state. [+] Because threads share memory, two threads writing the same variable can produce a "race condition" with corrupted results; this is why synchronization (locks) exists, though the synchronization mechanics are beyond this session. [+] Real-world example: a server spins up a thread per request so one slow request doesn't block the others.
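To make the race-condition warning concrete, here is a small sketch that goes slightly beyond the session. It assumes a hypothetical class CounterDemo with two counters: one incremented with no protection, and one incremented through a synchronized method. The unprotected total may come out lower than expected because count++ is a read-modify-write that two threads can interleave; the synchronized total is always exact.

```java
public class CounterDemo {
    private int unsafeCount = 0;
    private int safeCount = 0;

    void unsafeIncrement() { unsafeCount++; }           // read-modify-write, not atomic
    synchronized void safeIncrement() { safeCount++; }  // one thread at a time

    // Hypothetical driver: returns { unsafe total, safe total }.
    static int[] run(int threads, int perThread) throws InterruptedException {
        CounterDemo c = new CounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    c.unsafeIncrement();
                    c.safeIncrement();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();   // wait for every worker before reading
        return new int[] { c.unsafeCount, c.safeCount };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = run(4, 100_000);
        System.out.println("expected = 400000");
        System.out.println("unsafe   = " + r[0] + "  (may be lower: lost updates)");
        System.out.println("safe     = " + r[1]);
    }
}
```

The unsafe number differs from run to run and may occasionally happen to be correct, which is part of what makes races hard to spot.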

6. Code Demo: why threads don't run one-by-one

The demo runs two threads that each print numbers, with a sleep() between prints, and asks: why isn't the output produced one thread fully before the other? The slide's timeline shows Thread 1 printing N1, sleeping, printing N2, sleeping, printing N3 — while Thread 2 does the same — and the two sequences interleave along the time axis rather than finishing in strict order. The runnable version is in Code Samples → Multi-Threaded.

The reason: when a thread calls sleep() it gives up the CPU, so the scheduler is free to run the other thread during that pause. Execution order across threads is therefore not guaranteed. [+] Even without sleep(), the OS thread scheduler can preempt and switch threads at almost any point, so you should never assume a particular interleaving. If you need a guaranteed order, you must coordinate it explicitly (for example with join() or synchronization).

Why it matters: this single observation — "thread ordering is unpredictable" — is the root cause of most concurrency bugs and a favorite interview probe. Output that looks scrambled is normal, not a bug.
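If a fixed order really is required, the simplest tool from this session is join(). The sketch below is an illustration under assumed names (OrderedRunDemo, runInOrder): the second thread is not started until the first has finished, so the log order is guaranteed, at the price of giving up any overlap between the two.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class OrderedRunDemo {

    // Hypothetical helper: runs two tasks on two threads, strictly one after the other.
    static List<String> runInOrder() throws InterruptedException {
        List<String> log = Collections.synchronizedList(new ArrayList<>());

        Thread first = new Thread(() -> log.add("first"));
        Thread second = new Thread(() -> log.add("second"));

        first.start();
        first.join();     // do not start `second` until `first` has finished
        second.start();
        second.join();    // wait again so the log is complete before returning

        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runInOrder());   // prints [first, second]
    }
}
```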

7. Runnable and the core thread methods

Runnable is an interface used to define a task that can be executed by a thread. It has a single method, run(), which contains the code to be executed. You put your work inside run(), then hand the task to a thread to carry out.

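The lifecycle-controlling methods from the slides:

  • run(): holds the work to execute; you do not call it directly.
  • start(): begins thread execution; the JVM then calls run() on the new thread.
  • sleep(ms): suspends the current thread for the given time.
  • join(): makes the caller wait until the target thread finishes (in MultiThreadedExample, main waits for the NumberPrinter threads).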

Why it matters: these four (run, start, sleep, join) are the minimum toolkit to launch work and coordinate it. The two most-tested facts: start() vs run() (only start spawns a thread) and what join() waits on (the target thread, blocking the caller). [+] Use case for join(): when the main thread must collect results only after worker threads finish — e.g. wait for all matrix-row threads before summing the answer.
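The start() vs run() distinction is easy to check for yourself. In this sketch (the class StartVsRunDemo and the method ranOnCaller are illustrative names, not from the slides), the task records which thread executed it, and the helper reports whether that was the calling thread.

```java
public class StartVsRunDemo {

    // Returns true if the task executed on the thread that called this method.
    static boolean ranOnCaller(boolean useStart) throws InterruptedException {
        Thread[] seen = new Thread[1];   // slot for "which thread ran the task"
        Thread t = new Thread(() -> seen[0] = Thread.currentThread());

        if (useStart) {
            t.start();   // spawns a new thread; the JVM calls run() on it
            t.join();    // wait so seen[0] is filled in before we read it
        } else {
            t.run();     // plain method call: runs right here on the caller
        }
        return seen[0] == Thread.currentThread();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("run():   on caller thread? " + ranOnCaller(false)); // true
        System.out.println("start(): on caller thread? " + ranOnCaller(true));  // false
    }
}
```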

8. Thread Termination & Interruption

Threads are not free, so we sometimes need to stop them. The deck lists the resources a thread consumes: memory and kernel resources, plus CPU cycles and cache memory. Two motivations for termination: a thread finished its work but the application is still running, so we want to clean up its resources; or a thread is misbehaving and we want to stop it. A critical rule: by default the application will not stop as long as at least one thread is still running. [+] More precisely, the JVM stays alive while any non-daemon thread is running.

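The slides frame stopping a thread as interrupting it, and list when interruption is possible:

  • the thread is executing a method that throws InterruptedException (ThreadInterruptExample.java);
  • the thread's code handles the interrupt signal explicitly (ThreadExplicitInterruptExample.java).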

Both files are written out in Code Samples → Interrupts.

Why it matters & pitfall: [+] interrupt() is a cooperative request, not a forced kill: the target thread must be written to notice and respond to it. Java deprecated the old Thread.stop() because force-killing a thread mid-operation could leave shared data in a corrupt state. The "app won't exit while a thread runs" rule explains why a program sometimes seems to hang after its main logic finishes: a stray thread is still alive.
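[+] The "by default" in the exit rule hints at an escape hatch the slides do not cover: a thread marked as a daemon does not keep the JVM alive. The sketch below is an aside under assumed names (DaemonDemo, backgroundTicker), not part of the deck. It builds an endless background loop, marks it as a daemon before starting it, and lets main return.

```java
public class DaemonDemo {

    // Hypothetical helper: builds (but does not start) an endless background thread.
    static Thread backgroundTicker() {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    Thread.sleep(1_000);           // pretend to do periodic work
                } catch (InterruptedException e) {
                    return;                        // stop politely if interrupted
                }
            }
        });
        t.setDaemon(true);   // must be called BEFORE start()
        return t;
    }

    public static void main(String[] args) {
        Thread ticker = backgroundTicker();
        ticker.start();
        System.out.println("main done; ticker is daemon = " + ticker.isDaemon());
        // main returns here; the JVM exits even though the ticker loop never ends
    }
}
```

Remove the setDaemon(true) line and the same program never exits, which is the "stray thread" hang described above.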

9. Thread Life Cycle [+]

The deck shows this as a diagram only; [+] the following are the standard Java thread states (as defined by Thread.State) that the diagram represents:

State           [+] Meaning
NEW             Thread object created but start() not yet called.
RUNNABLE        After start(); either running or ready and waiting for CPU.
BLOCKED         Waiting to acquire a lock/monitor held by another thread.
WAITING         Waiting indefinitely for another thread (e.g. join(), wait() with no timeout).
TIMED_WAITING   Waiting for a set time (e.g. sleep(t), join(t), wait(t)).
TERMINATED      run() has finished or the thread was stopped.

[+] Flow: a thread starts at NEW, moves to RUNNABLE on start(), and from RUNNABLE may bounce into BLOCKED, WAITING, or TIMED_WAITING and back as it competes for locks or pauses, finally reaching TERMINATED when run() ends. It cannot be restarted once TERMINATED. The runnable demo that prints these states live is in Code Samples.

Why it matters: the life cycle ties the earlier methods together — sleep() produces TIMED_WAITING, join() produces WAITING, lock contention produces BLOCKED. [+] Interview classic: explaining the difference between BLOCKED (waiting for a lock) and WAITING (waiting for a signal/another thread).
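BLOCKED is the one state that sleep() and join() never produce, so it helps to see it once. This is a rough sketch with made-up names (BlockedStateDemo, observeBlocked): the calling thread holds a lock, starts a second thread that needs the same lock, and polls until that thread reports BLOCKED (giving up after two seconds).

```java
public class BlockedStateDemo {

    // Hypothetical helper: returns the state of a thread stuck waiting for a lock.
    static Thread.State observeBlocked() throws InterruptedException {
        final Object lock = new Object();
        Thread contender = new Thread(() -> {
            synchronized (lock) {
                // entered only after the caller releases the lock
            }
        });

        Thread.State seen;
        synchronized (lock) {                  // the caller holds the monitor
            contender.start();
            long deadline = System.currentTimeMillis() + 2_000;
            // poll until the contender is stuck on the monitor, or time out
            while (contender.getState() != Thread.State.BLOCKED
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);              // sleeping does NOT release the lock
            }
            seen = contender.getState();
        }                                      // lock released; contender proceeds
        contender.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Contender state while lock is held: " + observeBlocked());
    }
}
```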

10. Activity #1 — Matrix Multiplication & latency

The first hands-on activity multiplies a 1000×1000 matrix, first on a single thread, then in parallel. The parallelization idea posed by the slides: split the row multiplication into separate threads (e.g. four threads each handling a share of the rows), then wait for all the threads to complete before using the result. You then compare single-thread vs. multi-thread execution time for the same 1000×1000 matrix. Full runnable code is in Code Samples → Matrix.

Performance is measured with latency: defined in the deck as the time delay (usually in milliseconds) between the initiation of an action and the response or result of that action. [+] Lower latency is better. Splitting independent rows across threads is a good fit because each row's computation does not depend on the others, so they can run truly in parallel on multiple cores — and on a multi-core machine the multi-threaded version is typically faster.
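Measuring latency in code means taking a timestamp before and after the action. The helper below is one possible sketch; LatencyDemo and timeMillis are illustrative names. It uses System.nanoTime(), which is intended for measuring elapsed time, and converts the difference to milliseconds.

```java
public class LatencyDemo {

    // Hypothetical helper: runs the action once and returns elapsed milliseconds.
    static long timeMillis(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return (System.nanoTime() - start) / 1_000_000;   // ns -> ms
    }

    public static void main(String[] args) {
        long ms = timeMillis(() -> {
            try {
                Thread.sleep(200);                         // stand-in for real work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("Latency: " + ms + " ms");
    }
}
```

A single run is a noisy measurement on the JVM (class loading and JIT warm-up inflate the first call), so repeat the measurement a few times before comparing single-thread and multi-thread numbers.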

Why it matters: matrix rows are the textbook example of an "embarrassingly parallel" problem — independent chunks of work with no shared state — which is the ideal case for threading. The "wait for all threads" step is exactly where join() (from item 7) earns its keep.

11. Cost of Parallelization & Activity #2 (Image Inverter)

Parallelism is not free, and the image-inverter activity makes the costs visible. The image inverter inverts any RGB image, achieving parallelism by splitting and processing the image row-wise, where the number of splits equals the number of threads. Images come in five qualities: ULTRA_LOW, VERY_LOW, LOW, MEDIUM, HIGH. You tinker by changing the image quality (in FILE_SLUG) and numOfThreads, then measure latency. Runnable code is in Code Samples → Image Inverter.

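The deck's observed conclusions:

  • Beyond the number of cores, extra threads add context switching and give no gain, or make things worse.
  • For a low-quality (small) image, a single thread can win because thread-creation overhead outweighs the benefit.
  • The cost of parallelism is thread creation, context switching, and aggregating the results.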

The performance-analysis takeaway is stated as a rule: it is best when the number of threads equals the number of available processors, obtained in Java via Runtime.getRuntime().availableProcessors().

Why it matters & the trade-off: this is the practical heart of the session. The benefit of more threads rises, peaks around the core count, then flattens or declines because the costs — thread creation overhead, context switching, and aggregating partial results — start to dominate. [+] This is the intuition behind thread pools, which reuse a fixed number of threads (sized to the cores) instead of creating new ones per task. Common pitfall: assuming "more threads = more speed." Beyond the core count it is the opposite.
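[+] The thread-pool idea can be sketched with the standard java.util.concurrent executor API. The example below is an illustration, not code from the deck; PoolSumDemo and parallelSum are assumed names. It sums 1..n by splitting the range into one chunk per pool thread and adding up the partial results, with the pool sized from availableProcessors().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolSumDemo {

    // Hypothetical helper: sums 1..n using a fixed pool of `threads` workers.
    static long parallelSum(int n, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (n + threads - 1) / threads;       // numbers per task
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int from = t * chunk + 1;
                final int to = Math.min(from + chunk - 1, n);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = from; i <= to; i++) sum += i;
                    return sum;                            // partial result
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : parts) total += f.get(); // aggregate: waits per task
            return total;
        } finally {
            pool.shutdown();                               // let pool threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Pool size: " + cores);
        System.out.println("Sum 1..1000000 = " + parallelSum(1_000_000, cores));
    }
}
```

Here Future.get() plays the role that join() plays with raw threads: it blocks until that task's result is ready. Forgetting shutdown() leaves the pool's threads alive and the program will not exit.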

12. Key Takeaways


III. Code Samples [runnable]

Every file the notes refer to, written out and runnable. Compile a file with javac File.java and run with java ClassName. All examples target plain Java 8+ with no external libraries.

A. Single-Threaded vs. responsiveness

Demonstrates the core weakness of a single thread: a long task blocks everything after it.

SingleThreadedExample.java (javac + java SingleThreadedExample)
// One thread does everything in order. The "heavy" task
// blocks the line after it until it fully completes.
public class SingleThreadedExample {

    static void heavyTask() {
        System.out.println("Heavy task: started...");
        long sum = 0;
        for (long i = 0; i < 2_000_000_000L; i++) {
            sum += i;                 // busy work to simulate a slow job
        }
        System.out.println("Heavy task: done. sum=" + sum);
    }

    public static void main(String[] args) {
        System.out.println("App started");
        heavyTask();           // everything below WAITS for this
        System.out.println("This line only runs AFTER the heavy task");
        System.out.println("App finished");
    }
}
Expected output (strict order — that's the point)
App started
Heavy task: started...
Heavy task: done. sum=1999999999000000000
This line only runs AFTER the heavy task
App finished

Takeaway: there is no second thread to do anything while heavyTask() runs. In a UI app, this is the "Not Responding" freeze.

B. Multi-Threaded — Runnable, start/sleep/join, and interleaving

This single file covers items 5, 6, 7, and 9: a Runnable task (NumberPrinter), two threads interleaving because of sleep(), join() making main wait, and a peek at thread states.

MultiThreadedExample.java (javac + java MultiThreadedExample)
// Runnable defines a TASK; the Thread runs it.
class NumberPrinter implements Runnable {
    private final String name;
    NumberPrinter(String name) { this.name = name; }

    @Override
    public void run() {                // the work — NEVER call this directly
        for (int i = 1; i <= 3; i++) {
            System.out.println(name + " -> N" + i);
            try {
                Thread.sleep(300);    // pause CURRENT thread -> yields CPU
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}

public class MultiThreadedExample {
    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(new NumberPrinter("Thread-1"));
        Thread t2 = new Thread(new NumberPrinter("Thread-2"));

        System.out.println("t1 state before start: " + t1.getState()); // NEW

        t1.start();     // start() spawns a real thread -> JVM calls run()
        t2.start();

        System.out.println("t1 state after start: " + t1.getState());  // usually RUNNABLE; TIMED_WAITING if t1 is already in sleep()

        // join(): main WAITS here until both finish before printing the last line
        t1.join();
        t2.join();

        System.out.println("t1 state after join: " + t1.getState());   // TERMINATED
        System.out.println("All threads done. Main continues.");
    }
}
Sample output (order between Thread-1 / Thread-2 WILL vary run-to-run)
t1 state before start: NEW
t1 state after start: RUNNABLE
Thread-1 -> N1
Thread-2 -> N1
Thread-1 -> N2
Thread-2 -> N2
Thread-1 -> N3
Thread-2 -> N3
t1 state after join: TERMINATED
All threads done. Main continues.

Why interleaved: each sleep(300) releases the CPU, letting the scheduler run the other thread. Run it a few times — the interleaving changes. The "All threads done" line is guaranteed last only because of join().

Two ways to define a thread's work (reference)

snippet: Runnable vs extends Thread (prefer Runnable)
// Option 1 (preferred): implement Runnable, pass to a Thread
Runnable task = () -> System.out.println("hi from " + Thread.currentThread().getName());
new Thread(task).start();

// Option 2: extend Thread (less flexible — uses up your one superclass)
class MyThread extends Thread {
    public void run() { System.out.println("hi from MyThread"); }
}
new MyThread().start();

C. Thread Termination & Interruption

Two distinct interruption patterns the deck names.

C1. Interrupting a thread blocked in a method that throws InterruptedException

ThreadInterruptExample.java (javac + java ThreadInterruptExample)
// The worker is sleeping (blocked). interrupt() makes sleep()
// throw InterruptedException, breaking it out of the wait.
public class ThreadInterruptExample {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                System.out.println("Worker: sleeping for 10s...");
                Thread.sleep(10_000);
                System.out.println("Worker: finished sleep (won't reach here)");
            } catch (InterruptedException e) {
                System.out.println("Worker: interrupted while sleeping -> exiting");
            }
        });

        worker.start();
        Thread.sleep(1_000);        // let it start sleeping
        System.out.println("Main: sending interrupt");
        worker.interrupt();          // polite request, not a kill
        worker.join();
        System.out.println("Main: done");
    }
}
Expected output
Worker: sleeping for 10s...
Main: sending interrupt
Worker: interrupted while sleeping -> exiting
Main: done

C2. A thread that handles the interrupt signal explicitly

ThreadExplicitInterruptExample.java (javac + java ThreadExplicitInterruptExample)
// No blocking call to throw the exception. Instead the loop
// polls its own interrupted flag and chooses to stop.
public class ThreadExplicitInterruptExample {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            long count = 0;
            while (!Thread.currentThread().isInterrupted()) {
                count++;                       // tight CPU work
                if (count % 100_000_000L == 0) {
                    System.out.println("Worker: still running, count=" + count);
                }
            }
            System.out.println("Worker: noticed interrupt flag -> stopping");
        });

        worker.start();
        Thread.sleep(500);
        System.out.println("Main: requesting stop");
        worker.interrupt();              // sets the flag the loop checks
        worker.join();
        System.out.println("Main: done");
    }
}
Expected output (count values vary by machine speed)
Worker: still running, count=100000000
Worker: still running, count=200000000
Main: requesting stop
Worker: noticed interrupt flag -> stopping
Main: done
Key difference: C1 relies on a blocking method (sleep) throwing InterruptedException; C2 has no blocking call, so it must poll isInterrupted() itself. Both treat interrupt() as a cooperative request.

D. Activity #1 — Matrix Multiplication (single vs. multi-thread)

Multiplies two N×N matrices and times it. The multi-threaded version splits rows across threads and uses join() to wait for all of them.

MatrixMultiplication.java (javac + java MatrixMultiplication)
public class MatrixMultiplication {
    static final int N = 1000;          // 1000 x 1000

    // --- Single-threaded baseline ---
    static int[][] multiplySingle(int[][] a, int[][] b) {
        int[][] c = new int[N][N];
        for (int i = 0; i < N; i++)
            computeRow(a, b, c, i);
        return c;
    }

    // --- Multi-threaded: split rows across `threads` workers ---
    static int[][] multiplyMulti(int[][] a, int[][] b, int threads)
            throws InterruptedException {
        int[][] c = new int[N][N];
        Thread[] pool = new Thread[threads];
        int chunk = (N + threads - 1) / threads;   // rows per thread

        for (int t = 0; t < threads; t++) {
            final int start = t * chunk;
            final int end = Math.min(start + chunk, N);
            pool[t] = new Thread(() -> {
                for (int i = start; i < end; i++)
                    computeRow(a, b, c, i);
            });
            pool[t].start();
        }
        for (Thread th : pool) th.join();   // WAIT for all rows
        return c;
    }

    static void computeRow(int[][] a, int[][] b, int[][] c, int i) {
        for (int j = 0; j < N; j++) {
            int sum = 0;
            for (int k = 0; k < N; k++) sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[][] a = randomMatrix(), b = randomMatrix();

        long t0 = System.currentTimeMillis();
        multiplySingle(a, b);
        System.out.println("Single-thread latency: " +
                (System.currentTimeMillis() - t0) + " ms");

        int cores = Runtime.getRuntime().availableProcessors();
        long t1 = System.currentTimeMillis();
        multiplyMulti(a, b, cores);
        System.out.println("Multi-thread latency (" + cores + " threads): " +
                (System.currentTimeMillis() - t1) + " ms");
    }

    static int[][] randomMatrix() {
        int[][] m = new int[N][N];
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                m[i][j] = (int) (Math.random() * 10);
        return m;
    }
}
Sample output (numbers depend on your CPU)
Single-thread latency: 4120 ms
Multi-thread latency (8 threads): 760 ms

Each row is independent (no shared writes to the same cell), so this is "embarrassingly parallel" — the multi-thread version should be markedly faster on a multi-core machine.

E. Activity #2 — Image Inverter (cost of parallelization)

Inverts an RGB image by splitting it into horizontal bands, one thread per band. Swap FILE_SLUG (image size) and numOfThreads to observe where extra threads stop helping.

ImageInverter.java (javac + java ImageInverter)
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;

public class ImageInverter {
    // quality levels from the deck: ULTRA_LOW, VERY_LOW, LOW, MEDIUM, HIGH
    static final String FILE_SLUG = "MEDIUM";     // <- tinker
    static final int numOfThreads = 4;            // <- tinker

    static void invertBand(BufferedImage img, int yStart, int yEnd) {
        for (int y = yStart; y < yEnd; y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int a = (rgb >> 24) & 0xff;
                int r = 255 - ((rgb >> 16) & 0xff);
                int g = 255 - ((rgb >> 8)  & 0xff);
                int bl = 255 - (rgb & 0xff);
                img.setRGB(x, y, (a << 24) | (r << 16) | (g << 8) | bl);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        BufferedImage img = ImageIO.read(new File("input_" + FILE_SLUG + ".png"));
        int h = img.getHeight();
        int band = (h + numOfThreads - 1) / numOfThreads;

        Thread[] pool = new Thread[numOfThreads];
        long t0 = System.currentTimeMillis();

        for (int t = 0; t < numOfThreads; t++) {
            final int yStart = t * band;
            final int yEnd = Math.min(yStart + band, h);
            pool[t] = new Thread(() -> invertBand(img, yStart, yEnd));
            pool[t].start();
        }
        for (Thread th : pool) th.join();   // aggregate: wait for all bands

        System.out.println("Inverted " + FILE_SLUG + " with " + numOfThreads +
                " threads in " + (System.currentTimeMillis() - t0) + " ms");
        System.out.println("Cores available: " +
                Runtime.getRuntime().availableProcessors());
        ImageIO.write(img, "png", new File("output_" + FILE_SLUG + ".png"));
    }
}
Sample output
Inverted MEDIUM with 4 threads in 38 ms
Cores available: 8
Experiment: run the same image with numOfThreads = 1, 2, 4, 8, 16, 32. Latency drops until you hit roughly your core count, then flattens or worsens. On a small image (ULTRA_LOW), 1 thread can beat 16 because thread-creation overhead dominates the tiny job.

IV. Hands-On Lab — Practice

Do these in order. Each exercise has a goal, steps, and a solution; try first, then check. Everything runs with plain javac / java; no libraries except Exercise 6.

Setup (one time)

  1. Confirm Java is installed: run java -version and javac -version (need JDK 8+).
  2. Make a folder: mkdir threads-lab && cd threads-lab
  3. For each exercise, create FileName.java, then compile + run:
    javac FileName.java  then  java ClassName
  4. Find your core count first — you'll need it: System.out.println(Runtime.getRuntime().availableProcessors());

Warm-up Exercise 1 — Your first two threads

Goal: Launch two threads that each print their name 5 times, and prove the output interleaves.

  1. Write a class Greeter that implements Runnable; in run() loop 5 times printing name + " says hi #" + i.
  2. In main, create two Thread objects wrapping two Greeters ("Alice", "Bob") and start() both.
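  3. Run it 3 times. Note how the order changes. If every run looks identical, the threads are finishing too quickly to overlap; raise the loop count or add a short Thread.sleep() inside the loop.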
  4. Predict then verify: what happens if you call run() directly instead of start()?
Solution:
class Greeter implements Runnable {
    private final String name;
    Greeter(String n) { name = n; }
    public void run() {
        for (int i = 1; i <= 5; i++)
            System.out.println(name + " says hi #" + i);
    }
}
public class Ex1 {
    public static void main(String[] a) {
        new Thread(new Greeter("Alice")).start();
        new Thread(new Greeter("Bob")).start();
    }
}

Answer to the prediction: calling run() directly runs the code on the main thread — no new thread, no interleaving. You'd see all of Alice then all of Bob, in order. Only start() spawns a thread.
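One way to check that answer directly is to record which thread actually executes the task. The sketch below is an illustration only; the class RunVsStart, the helper executedBy, and the thread name "worker-1" are made up for this example.

```java
class RunVsStart {
    // Returns the name of the thread that actually executed the task.
    static String executedBy(boolean useStart) throws InterruptedException {
        final String[] seen = new String[1];
        Thread t = new Thread(
                () -> seen[0] = Thread.currentThread().getName(), "worker-1");
        if (useStart) {
            t.start();   // a new thread runs the lambda
            t.join();    // wait, so the write to seen[0] is visible here
        } else {
            t.run();     // plain method call on the caller's thread
        }
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("run()   executed by: " + executedBy(false));
        System.out.println("start() executed by: " + executedBy(true));
    }
}
```

Launched normally with java RunVsStart, the first line names main and the second names worker-1.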

Warm-up Exercise 2 — Make main wait with join()

Goal: Print "DONE" only after a worker thread truly finishes — first without join() (buggy), then fix it.

  1. Create a worker that sleeps 1 second then prints "work complete".
  2. In main: start it, immediately print "DONE". Observe that "DONE" prints before "work complete".
  3. Now add worker.join() before printing "DONE" and re-run.
Show solution
public class Ex2 {
    public static void main(String[] a) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try { Thread.sleep(1000); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); return; } // interrupted: keep the flag, stop early
            System.out.println("work complete");
        });
        worker.start();
        worker.join();             // remove this line to see the bug
        System.out.println("DONE");
    }
}

Without join(), main races ahead and prints "DONE" first. With it, main blocks until the worker's run() ends.
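The same guarantee can be observed through isAlive(). The following is a small sketch under assumed names (JoinDemo and aliveWhenCallerContinues are hypothetical): it reports whether the worker is still running at the moment the caller moves on.

```java
class JoinDemo {
    // Starts a worker that sleeps for workMillis, optionally joins it, and
    // reports whether the worker is still alive when the caller continues.
    static boolean aliveWhenCallerContinues(long workMillis, boolean useJoin)
            throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(workMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag
            }
        });
        worker.start();
        if (useJoin) worker.join();      // block until run() has returned
        boolean alive = worker.isAlive();
        worker.join();                   // always wait before returning
        return alive;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("without join, worker still alive: "
                + aliveWhenCallerContinues(500, false));
        System.out.println("with join,    worker still alive: "
                + aliveWhenCallerContinues(500, true));
    }
}
```

With join() the answer is always false. Without it the worker is normally still asleep, so the answer is true unless the calling thread is delayed for longer than the worker's sleep.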

Core Exercise 3 — Watch the life-cycle states

Goal: Print a thread's getState() at NEW, RUNNABLE, TIMED_WAITING, and TERMINATED.

  1. Create a worker whose run() sleeps 500ms.
  2. Print state right after construction (expect NEW).
  3. start(), then print state (expect RUNNABLE; on a rare run the worker may already be sleeping and report TIMED_WAITING).
  4. From main, sleep 100ms, then print the worker's state while it sleeps (expect TIMED_WAITING).
  5. join(), then print state (expect TERMINATED).
Show solution
public class Ex3 {
    public static void main(String[] a) throws InterruptedException {
        Thread w = new Thread(() -> {
            try { Thread.sleep(500); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        System.out.println("NEW?       " + w.getState());
        w.start();
        System.out.println("RUNNABLE?  " + w.getState());
        Thread.sleep(100);
        System.out.println("TIMED?     " + w.getState());
        w.join();
        System.out.println("TERMINATED? " + w.getState());
    }
}

Core Exercise 4 — Cooperative interrupt

Goal: Write a counter thread that runs forever until interrupted, then stop it cleanly from main after 1 second.

  1. Worker: loop while (!Thread.currentThread().isInterrupted()), incrementing a counter.
  2. When the loop exits, print the final count.
  3. Main: start it, sleep 1000ms, call interrupt(), then join().
  4. Bonus: add a version using sleep() inside the loop and catch InterruptedException instead — compare the two stop mechanisms (C1 vs C2 above).
Show solution
public class Ex4 {
    public static void main(String[] a) throws InterruptedException {
        Thread w = new Thread(() -> {
            long n = 0;
            while (!Thread.currentThread().isInterrupted()) n++;
            System.out.println("Stopped. counted to " + n);
        });
        w.start();
        Thread.sleep(1000);
        w.interrupt();
        w.join();
    }
}

Pitfall to notice: if you swallow InterruptedException in a sleep()-based loop without re-checking the flag, you can accidentally make a thread un-interruptible. Always either exit or call Thread.currentThread().interrupt() to restore the flag.
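For the bonus step, a sleep()-based worker might look like the sketch below. The names Ex4Sleep and runAndInterrupt and the 10 ms tick are assumptions chosen for illustration. The worker leaves its loop through the exception rather than through a flag check, and the catch block restores the flag as the pitfall note advises.

```java
import java.util.concurrent.atomic.AtomicLong;

class Ex4Sleep {
    // Runs a sleeping counter for about runMillis, interrupts it,
    // waits for it to stop, and returns how many ticks it counted.
    static long runAndInterrupt(long runMillis) throws InterruptedException {
        AtomicLong ticks = new AtomicLong();
        Thread w = new Thread(() -> {
            try {
                while (true) {
                    ticks.incrementAndGet();
                    Thread.sleep(10);   // throws if interrupted before or during the sleep
                }
            } catch (InterruptedException e) {
                // sleep() cleared the flag when it threw; restore it, then exit
                Thread.currentThread().interrupt();
            }
            System.out.println("Stopped after " + ticks.get() + " ticks");
        });
        w.start();
        Thread.sleep(runMillis);
        w.interrupt();
        w.join();
        return ticks.get();
    }

    public static void main(String[] args) throws InterruptedException {
        runAndInterrupt(1000);
    }
}
```

Placing the try around the whole loop is what makes the interrupt end the thread. If the try/catch sat inside the loop with an empty catch, the loop would simply continue and the thread would ignore the interrupt.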

Challenge Exercise 5 — Parallel sum & the speed-up curve

Goal: Sum a large array single-threaded, then split across N threads with join(), and find where adding threads stops helping.

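  1. Build a long[] of 100 million elements filled with 1 (so the answer is known: 100,000,000). The array needs about 800 MB of heap; if you hit OutOfMemoryError, run with a larger heap (for example java -Xmx2g Ex5) or shrink N.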
  2. Time a single-threaded sum.
  3. Write sumParallel(arr, threads): give each thread a slice, each computes a partial sum into partials[t], join() all, then add the partials.
  4. Run for threads = 1, 2, 4, 8, 16. Plot (or just eyeball) latency vs threads. Confirm the lowest latency (best speed-up) lands near availableProcessors().
  5. Think: why must each thread write to its own partials[t] instead of a shared total variable?
Show solution
public class Ex5 {
    static final int N = 100_000_000;

    static long sumParallel(long[] arr, int threads) throws InterruptedException {
        Thread[] pool = new Thread[threads];
        long[] partials = new long[threads];   // each thread owns one slot
        int chunk = (arr.length + threads - 1) / threads;
        for (int t = 0; t < threads; t++) {
            final int idx = t, s = t * chunk, e = Math.min(s + chunk, arr.length);
            pool[t] = new Thread(() -> {
                long sum = 0;
                for (int i = s; i < e; i++) sum += arr[i];
                partials[idx] = sum;                 // no shared write contention
            });
            pool[t].start();
        }
        for (Thread th : pool) th.join();
        long total = 0;
        for (long p : partials) total += p;  // aggregate AFTER join
        return total;
    }

    public static void main(String[] a) throws InterruptedException {
        long[] arr = new long[N];
        java.util.Arrays.fill(arr, 1);
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores = " + cores);
        for (int th : new int[]{1,2,4,8,16}) {
            long t0 = System.currentTimeMillis();
            long sum = sumParallel(arr, th);
            System.out.println(th + " threads: " +
                (System.currentTimeMillis() - t0) + " ms (sum=" + sum + ")");
        }
    }
}

Answer: if every thread did total += partial on one shared variable, the unsynchronized read-modify-write would race and lose updates — a classic data race. Per-thread slots avoid contention entirely; you combine them safely only after every thread has joined.
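To see the lost updates for yourself, here is a deliberately broken sketch. SharedTotalRace and racySum are hypothetical names, and the code is the anti-pattern described above, not something to copy into real programs.

```java
class SharedTotalRace {
    static long total;   // shared and deliberately unsynchronized

    // Every thread adds 1 to the shared total perThread times, with no locking.
    static long racySum(int threads, int perThread) throws InterruptedException {
        total = 0;
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) total += 1; // read-modify-write, not atomic
            });
            pool[t].start();
        }
        for (Thread th : pool) th.join();
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 4, perThread = 1_000_000;
        System.out.println("expected = " + (long) threads * perThread);
        System.out.println("actual   = " + racySum(threads, perThread));
    }
}
```

On a multi-core machine the actual value usually comes out below the expected one and differs between runs. With a single thread the two always match, because nothing else touches total.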

Challenge Exercise 6 — Recreate the Image Inverter activity

Goal: Reproduce Activity #2 end-to-end and measure the latency vs. thread-count curve on real images.

  1. Grab a few PNGs at different sizes; name them input_ULTRA_LOW.png through input_HIGH.png (one file per FILE_SLUG value).
  2. Use the ImageInverter.java from Code Samples → E.
  3. For FILE_SLUG = "HIGH", run with numOfThreads = 1, 2, 4, 8, 16, 32 and record each latency.
  4. Repeat for FILE_SLUG = "ULTRA_LOW".
  5. Open output_*.png to confirm colors are inverted.
What you should observe
  • HIGH: latency falls as threads rise, bottoms out around your core count, then flattens/worsens past it (context-switch cost).
  • ULTRA_LOW: 1 thread often wins — the job is so small that creating extra threads costs more than it saves.
  • Rule confirmed: for CPU-bound work like this, the lowest latency comes at numOfThreads ≈ Runtime.getRuntime().availableProcessors().

Stretch goal: replace the raw Thread[] with an ExecutorService thread pool (Executors.newFixedThreadPool(cores)) and compare — pools reuse threads instead of recreating them per run.
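As a starting point for the stretch goal, the sketch below applies a fixed thread pool to the array sum from Exercise 5 rather than to the image inverter, since that needs no input files. PoolSum and sumWithPool are illustrative names; treat this as one possible shape, not the reference solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolSum {
    // Splits arr into one slice per worker and sums the slices on a fixed pool.
    static long sumWithPool(long[] arr, int workers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (arr.length + workers - 1) / workers;
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < workers; t++) {
                final int s = Math.min(t * chunk, arr.length);
                final int e = Math.min(s + chunk, arr.length);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = s; i < e; i++) sum += arr[i];
                    return sum;               // each task returns its own partial
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : parts) total += f.get(); // get() blocks until the task is done
            return total;
        } finally {
            pool.shutdown();                  // let the pool threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        long[] arr = new long[10_000_000];
        java.util.Arrays.fill(arr, 1);
        int cores = Runtime.getRuntime().availableProcessors();
        long t0 = System.currentTimeMillis();
        long sum = sumWithPool(arr, cores);
        System.out.println(cores + " pool threads: "
                + (System.currentTimeMillis() - t0) + " ms (sum=" + sum + ")");
    }
}
```

Future.get() plays the role that join() plus the partials array played in Exercise 5: each task hands back its own partial sum, and the caller combines them only after the task has finished. To measure the benefit of thread reuse, create the pool once and submit several rounds of work to it; this sketch builds a new pool on every call to stay short.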

How to know you've got it: you can (1) explain why Ex1's output interleaves, (2) state exactly what Ex2's join() guarantees, (3) name which state each pause in Ex3 produces, (4) describe why interrupt() in Ex4 is cooperative, and (5) explain the per-thread-partials trick in Ex5 in terms of race conditions.