
Week 3 - Collections, Streams, and java.time

Conceptual Core

The collections framework is older than most engineers reading this. The modern stance:

  • List.of / Map.of for immutable literals.
  • Stream for transformations under ~1M elements; readable, JIT-friendly past warmup.
  • plain for loops for hot paths where allocation matters.
  • never Vector / Hashtable / Stack (synchronized, legacy).
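As a rough illustration of the first bullet, the sketch below shows what "immutable literal" means in practice. The class name, constants, and helper are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ImmutableLiterals {
    // Hypothetical constants, used only for illustration.
    static final List<String> ROLES = List.of("admin", "editor", "viewer");
    static final Map<String, Integer> LIMITS = Map.of("admin", 100, "viewer", 10);

    /** Returns true if the list refuses an add() call. */
    static boolean rejectsMutation(List<String> list) {
        try {
            list.add("guest");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // The factory-made list refuses mutation.
        System.out.println(rejectsMutation(ROLES));
        // Copy into an ArrayList when a mutable version is genuinely needed.
        System.out.println(rejectsMutation(new ArrayList<>(ROLES)));
    }
}
```

Copying into an ArrayList at the point of need keeps the shared constant safe to hand out.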

java.time (JSR-310) has been the right answer since Java 8 (2014). Date/Calendar exist only to torment you in legacy code - convert at the boundary.
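One way to picture "convert at the boundary" is a small adapter that is the only place legacy types appear. This is a minimal sketch; the class and method names are assumptions, not part of any library.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Date;
import java.util.GregorianCalendar;

class LegacyBoundary {
    /** Legacy in: turn a java.util.Date into an Instant as soon as it arrives. */
    static Instant fromLegacy(Date date) {
        return date.toInstant();
    }

    /** Legacy out: convert back only when an API still demands a Date. */
    static Date toLegacy(Instant instant) {
        return Date.from(instant);
    }

    /** A GregorianCalendar carries a zone, so it maps to ZonedDateTime. */
    static ZonedDateTime fromLegacy(GregorianCalendar calendar) {
        return calendar.toZonedDateTime();
    }

    /** A calendar date only exists relative to a zone, so the zone is explicit. */
    static LocalDate localDateOf(Instant instant, ZoneId zone) {
        return instant.atZone(zone).toLocalDate();
    }

    public static void main(String[] args) {
        Instant now = fromLegacy(new Date());
        System.out.println(localDateOf(now, ZoneId.of("UTC")));
    }
}
```

Everything past this adapter can then work purely in java.time types.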

Mechanical Detail

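  • Collection hierarchy: List (ordered, indexed), Set (unique), Deque (double-ended); Map (key→value) sits alongside the hierarchy rather than extending Collection. SequencedCollection (JEP 431, Java 21+) adds reversed(), getFirst(), getLast() to the ordered subtypes (List, Deque, LinkedHashSet, SortedSet); SequencedMap offers the analogous firstEntry() / lastEntry() - a uniform endpoint API.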
  • Pick by access pattern: lookup by key → HashMap; insertion-ordered iteration → LinkedHashMap; sorted iteration → TreeMap; cheap append/iterate → ArrayList; cheap deque ops → ArrayDeque (not LinkedList, which is slower in practice for almost everything).
  • Immutable factories: List.of(a, b, c), Map.of(k1, v1, k2, v2), Map.ofEntries(Map.entry(k, v), ...). They reject null and throw UnsupportedOperationException on mutation - by design.
  • Stream cost model: lazy until a terminal operation (collect, forEach, count) runs; auto-boxes primitives unless you use IntStream / LongStream / DoubleStream (which expose .sum(), .average(), .summaryStatistics() without boxing). Stream.toList() (16+) returns an unmodifiable list; Collectors.toList() makes no guarantee about mutability (today it happens to return an ArrayList), so use Collectors.toCollection(ArrayList::new) when you need to mutate the result.
  • parallelStream is a footgun in 99% of cases - it runs on the shared ForkJoinPool.commonPool(), so a task that blocks ties up a worker and can starve every other parallel stream in the JVM. For blocking or I/O-bound work, default to virtual-thread executors (Month 4) instead.
  • java.time essentials: Instant (machine time, UTC), Duration (machine-scale gap), LocalDate/LocalTime/LocalDateTime (no zone), ZonedDateTime (with zone), OffsetDateTime (with offset only), Period (human-scale gap, like "3 months"). At the boundary with legacy APIs, convert: date.toInstant() on the way in, Date.from(instant) on the way out, and instant.atZone(zone) when you need a calendar view.
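The stream cost model above can be sketched as follows. The class and method names are hypothetical, and the code assumes Java 16+ for Stream.toList().

```java
import java.util.ArrayList;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class StreamCostModel {
    /** Without a terminal operation the pipeline never runs, so nothing is counted. */
    static int seenWithoutTerminalOp(List<String> words) {
        AtomicInteger seen = new AtomicInteger();
        words.stream().peek(w -> seen.incrementAndGet()); // intermediate op only
        return seen.get();
    }

    /** mapToInt switches to IntStream, so the sum is computed without boxing. */
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    /** Count, min, max, sum, and average from a single pass over primitives. */
    static IntSummaryStatistics lengthStats(List<String> words) {
        return words.stream().mapToInt(String::length).summaryStatistics();
    }

    /** Stream.toList() (Java 16+) hands back an unmodifiable list. */
    static List<String> upperUnmodifiable(List<String> words) {
        return words.stream().map(String::toUpperCase).toList();
    }

    /** When the caller must mutate the result, request an ArrayList explicitly. */
    static List<String> upperMutable(List<String> words) {
        return words.stream().map(String::toUpperCase)
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "bb", "ccc");
        System.out.println(seenWithoutTerminalOp(words));
        System.out.println(totalLength(words));
        System.out.println(lengthStats(words));
    }
}
```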

The trap

Collectors.toMap(keyFn, valueFn) throws IllegalStateException on duplicate keys. Always provide the merge function: Collectors.toMap(keyFn, valueFn, (a, b) -> a) to keep first, (a, b) -> b to keep last.
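A small sketch of the trap and both fixes. The User record and its sample data are hypothetical, and the code assumes Java 16+ for records.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapDuplicates {
    /** Hypothetical record, used only for illustration. */
    record User(String email, String name) {}

    /** Two-argument toMap: throws IllegalStateException on a duplicate key. */
    static Map<String, String> strict(List<User> users) {
        return users.stream().collect(Collectors.toMap(User::email, User::name));
    }

    /** Merge function keeps the value already in the map. */
    static Map<String, String> keepFirst(List<User> users) {
        return users.stream()
                .collect(Collectors.toMap(User::email, User::name, (a, b) -> a));
    }

    /** Merge function replaces it with the newer value. */
    static Map<String, String> keepLast(List<User> users) {
        return users.stream()
                .collect(Collectors.toMap(User::email, User::name, (a, b) -> b));
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User("a@example.com", "Ann"),
                new User("a@example.com", "Anna"),
                new User("b@example.com", "Bob"));
        System.out.println(keepFirst(users).get("a@example.com"));
        System.out.println(keepLast(users).get("a@example.com"));
    }
}
```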

Lab

Take a CSV file of timestamped events. Compute per-hour aggregates two ways:

  1. Stream + Collectors.groupingBy(e -> e.timestamp().truncatedTo(HOURS), Collectors.counting()), with HOURS statically imported from ChronoUnit.
  2. Explicit for loop + HashMap<Instant, Long>.

Benchmark both with JMH in Week 8. Also measure peak memory (run with -Xlog:gc*=info and watch the young-gen allocation rate).
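A possible starting point for the two lab variants. CSV parsing is left out, and the Event record is an assumed shape, since the lab does not specify the file's columns.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class HourlyCounts {
    /** Hypothetical event shape; the real CSV columns are not specified. */
    record Event(Instant timestamp, String payload) {}

    /** Variant 1: stream with groupingBy and a counting downstream collector. */
    static Map<Instant, Long> withStream(List<Event> events) {
        return events.stream().collect(Collectors.groupingBy(
                e -> e.timestamp().truncatedTo(ChronoUnit.HOURS),
                Collectors.counting()));
    }

    /** Variant 2: explicit loop; merge() inserts 1 or adds 1 to the existing count. */
    static Map<Instant, Long> withLoop(List<Event> events) {
        Map<Instant, Long> counts = new HashMap<>();
        for (Event e : events) {
            counts.merge(e.timestamp().truncatedTo(ChronoUnit.HOURS), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event(Instant.parse("2024-05-01T10:05:00Z"), "a"),
                new Event(Instant.parse("2024-05-01T10:59:59Z"), "b"),
                new Event(Instant.parse("2024-05-01T11:00:00Z"), "c"));
        System.out.println(withStream(events).equals(withLoop(events)));
    }
}
```

Both variants box the counts as Long, so any allocation difference you measure comes mainly from the stream pipeline and collector machinery rather than from the map values.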

Idiomatic Drill

Find any code using SimpleDateFormat. Replace with DateTimeFormatter. Explain why the old one isn't thread-safe (mutable Calendar field, parsed state held internally). Audit any static SimpleDateFormat field for concurrent-use bugs.
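A minimal sketch of the replacement. The pattern and class name are assumptions chosen for illustration; your codebase's patterns will differ.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class DayFormat {
    // Before (unsafe when shared across threads):
    //   private static final SimpleDateFormat FMT = new SimpleDateFormat("yyyy-MM-dd");
    // DateTimeFormatter is immutable and thread-safe, so a static constant is fine.
    static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("uuuu-MM-dd");

    /** Parses text such as "2024-02-29"; throws DateTimeParseException on bad input. */
    static LocalDate parseDay(String text) {
        return LocalDate.parse(text, DAY);
    }

    static String formatDay(LocalDate day) {
        return DAY.format(day);
    }

    public static void main(String[] args) {
        System.out.println(formatDay(parseDay("2024-02-29")));
    }
}
```

Note the target type: a date-only pattern maps to LocalDate, not to an instant, which is itself a useful correction when migrating away from Date.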

Production Hardening Slice

Read every method on Optional. Then read Brian Goetz's guidance, usually summarized as: Optional is for return types, not parameters and not fields. Internalize.
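The return-type rule might look like this in practice. UserDirectory and its methods are hypothetical names used only for this sketch.

```java
import java.util.Map;
import java.util.Optional;

class UserDirectory {
    // Plain field: no Optional here, the map itself models "may be absent".
    private final Map<String, String> emailsById;

    UserDirectory(Map<String, String> emailsById) {
        this.emailsById = Map.copyOf(emailsById);
    }

    /** Optional as a return type: a missing user is an expected outcome. */
    Optional<String> findEmail(String userId) {
        return Optional.ofNullable(emailsById.get(userId));
    }

    /** Callers decide what absence means, here by supplying a fallback. */
    String emailOrDefault(String userId, String fallback) {
        return findEmail(userId).map(String::toLowerCase).orElse(fallback);
    }

    public static void main(String[] args) {
        UserDirectory dir = new UserDirectory(Map.of("u1", "Ann@Example.com"));
        System.out.println(dir.emailOrDefault("u1", "none"));
        System.out.println(dir.emailOrDefault("u2", "none"));
    }
}
```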

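Related: both Stream.findFirst() and Stream.findAny() short-circuit. findFirst() returns the first element in encounter order, so it is deterministic on an ordered stream; findAny() may return any matching element, which frees a parallel pipeline from ordering work. Use the one that matches your intent.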
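A short sketch of the difference. It uses parallelStream only to make the ordering guarantee visible, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Optional;

class FindDemo {
    /** Respects encounter order: on a List source, always the first even number. */
    static Optional<Integer> firstEven(List<Integer> xs) {
        return xs.parallelStream().filter(x -> x % 2 == 0).findFirst();
    }

    /** No ordering promise: any even number may come back, varying between runs. */
    static Optional<Integer> anyEven(List<Integer> xs) {
        return xs.parallelStream().filter(x -> x % 2 == 0).findAny();
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(1, 4, 6, 8);
        System.out.println(firstEven(xs));
        System.out.println(anyEven(xs));
    }
}
```

If any match will do, findAny() states that intent; if callers rely on "the earliest", only findFirst() guarantees it.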
