Week 3 - Collections, Streams, and java.time
Conceptual Core
The collections framework is older than most engineers reading this. The modern stance:
- List.of / Map.of for immutable literals.
- Stream for transformations under ~1M elements; readable, JIT-friendly past warmup.
- plain for loops for hot paths where allocation matters.
- never Vector / Hashtable / Stack (synchronized, legacy).
java.time (JSR-310) has been the right answer since Java 8 (2014). Date/Calendar exist only to torment you in legacy code - convert at the boundary.
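A minimal sketch of "convert at the boundary", assuming a legacy API that hands you a java.util.Date. The class and method names here are hypothetical, made up for illustration; the conversion calls themselves (Date.toInstant, Date.from, Instant.atZone) are standard JDK methods.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.Date;

// Hypothetical helper class: keeps Date at the edge and java.time everywhere else.
final class LegacyTimeBoundary {
    // Legacy -> modern: convert as soon as the value enters your code.
    static Instant fromLegacy(Date legacy) {
        return legacy.toInstant();
    }

    // Modern -> legacy: convert only when handing a value to an old API.
    static Date toLegacy(Instant instant) {
        return Date.from(instant);
    }

    // An Instant has no calendar fields; a zone must be supplied to get a date.
    static LocalDate calendarDate(Instant instant, ZoneId zone) {
        return instant.atZone(zone).toLocalDate();
    }

    public static void main(String[] args) {
        Date legacy = new Date(0L); // the epoch, standing in for a legacy API result
        Instant instant = fromLegacy(legacy);
        System.out.println(instant);
        // The same instant falls on different calendar dates in different zones.
        System.out.println(calendarDate(instant, ZoneOffset.UTC));
        System.out.println(calendarDate(instant, ZoneOffset.ofHours(-8)));
        System.out.println(toLegacy(instant).getTime());
    }
}
```

The zone parameter is explicit on purpose: relying on ZoneId.systemDefault() makes the result depend on where the code runs.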
Mechanical Detail
- Collection hierarchy: List (ordered, indexed), Set (unique), Map (key→value; part of the framework, though not a Collection subtype), Deque (double-ended). SequencedCollection (JEP 431, Java 21+) adds reversed(), getFirst(), getLast() to the relevant subtypes - a uniform endpoint API.
- Pick by access pattern: lookup by key → HashMap; insertion-ordered iteration → LinkedHashMap; sorted iteration → TreeMap; cheap append/iterate → ArrayList; cheap deque ops → ArrayDeque (not LinkedList, which is slower in practice for almost everything).
- Immutable factories: List.of(a, b, c), Map.of(k1, v1, k2, v2), Map.ofEntries(Map.entry(k, v), ...). Map.of stops at 10 pairs; use Map.ofEntries beyond that. They reject null and throw UnsupportedOperationException on mutation - by design.
- Stream cost model: lazy until a terminal operation (collect, forEach, count); auto-boxes primitives unless you use IntStream / LongStream / DoubleStream (which expose .sum(), .average(), .summaryStatistics() without boxing). Stream.toList() (16+) returns an unmodifiable list; Collectors.toList() returns a mutable ArrayList in practice, but the spec makes no guarantee, so use Collectors.toCollection(ArrayList::new) when you need to mutate the result.
- parallelStream is a footgun in 99% of cases - it runs on the shared ForkJoinPool.commonPool(), so one slow or blocking task can starve every other parallel stream in the JVM. Default to virtual-thread executors (Month 4) instead.
- java.time essentials: Instant (machine time, UTC), Duration (machine-scale gap), LocalDate / LocalTime / LocalDateTime (no zone), ZonedDateTime (with zone), OffsetDateTime (with offset only), Period (human-scale gap, like "3 months"). At the boundary with legacy APIs, convert: date.toInstant() coming in, Date.from(instant) going out, and instant.atZone(...) when you need calendar fields.
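The sketch below illustrates three of the behaviors listed above: stream laziness, primitive streams without boxing, and which lists reject mutation. The class and method names are hypothetical, and Stream.toList() assumes Java 16 or later.

```java
import java.util.ArrayList;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not part of any library.
final class StreamCostModel {
    // Returns how many elements flowed through peek(). Without a terminal
    // operation the pipeline is only a description, so nothing is visited.
    static int elementsVisited(boolean runTerminal) {
        List<Integer> visited = new ArrayList<>();
        Stream<Integer> pipeline = Stream.of(1, 2, 3).peek(visited::add);
        if (runTerminal) {
            pipeline.forEach(x -> { }); // terminal operation triggers the work
        }
        return visited.size();
    }

    // mapToInt switches to IntStream, so the lengths are never boxed.
    static IntSummaryStatistics lengths(List<String> words) {
        return words.stream().mapToInt(String::length).summaryStatistics();
    }

    // Probes a list by attempting one add(); note this mutates a mutable list.
    static boolean isUnmodifiable(List<String> list) {
        try {
            list.add("probe");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(elementsVisited(false));
        System.out.println(elementsVisited(true));
        System.out.println(lengths(List.of("a", "bbb", "cc")));
        System.out.println(isUnmodifiable(List.of("a")));
        System.out.println(isUnmodifiable(Stream.of("a").toList()));
        // Unspecified by the API; current OpenJDK builds return an ArrayList here.
        System.out.println(isUnmodifiable(Stream.of("a").collect(Collectors.toList())));
    }
}
```

forEach is used as the terminal operation rather than count(), because count() is allowed to skip the pipeline (including peek) when it can compute the size directly from the source.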
The trap
Collectors.toMap(keyFn, valueFn) throws IllegalStateException on duplicate keys. Always provide the merge function: Collectors.toMap(keyFn, valueFn, (a, b) -> a) to keep first, (a, b) -> b to keep last.
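A small sketch of the trap and both fixes, assuming a list of words keyed by their first letter. The class and method names are hypothetical; only Collectors.toMap and its three-argument overload come from the JDK.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class for the toMap duplicate-key behavior.
final class ToMapMerge {
    // Key: first letter of the word. Value: the word itself.
    private static String firstLetter(String word) {
        return word.substring(0, 1);
    }

    // Two-argument toMap: throws IllegalStateException on the first duplicate key.
    static boolean throwsOnDuplicate(List<String> words) {
        try {
            words.stream().collect(
                Collectors.toMap(ToMapMerge::firstLetter, Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Merge function (a, b) -> a keeps the value seen first.
    static Map<String, String> keepFirst(List<String> words) {
        return words.stream().collect(
            Collectors.toMap(ToMapMerge::firstLetter, Function.identity(), (a, b) -> a));
    }

    // Merge function (a, b) -> b keeps the value seen last.
    static Map<String, String> keepLast(List<String> words) {
        return words.stream().collect(
            Collectors.toMap(ToMapMerge::firstLetter, Function.identity(), (a, b) -> b));
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "avocado", "banana");
        System.out.println(throwsOnDuplicate(words));
        System.out.println(keepFirst(words));
        System.out.println(keepLast(words));
    }
}
```

If silently dropping a value would hide a data bug, a merge function that throws a descriptive exception of your own is also a reasonable choice.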
Lab
Take a CSV file of timestamped events. Compute per-hour aggregates two ways:
1. Stream + Collectors.groupingBy(e -> e.timestamp().truncatedTo(HOURS), Collectors.counting()).
2. Explicit for loop + HashMap<Instant, Long>.
JMH them in Week 8. Also measure peak memory (-Xlog:gc*=info + young-gen allocation rate).
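One possible starting point for the two variants, sketched under simplifying assumptions: CSV parsing is skipped, and each event is reduced to its Instant timestamp (the lab's e.timestamp() is assumed to return an Instant). The class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical lab skeleton: per-hour event counts computed two ways.
final class HourlyCounts {
    // Variant 1: stream + groupingBy with a counting downstream collector.
    static Map<Instant, Long> withStream(List<Instant> timestamps) {
        return timestamps.stream().collect(
            Collectors.groupingBy(t -> t.truncatedTo(ChronoUnit.HOURS),
                                  Collectors.counting()));
    }

    // Variant 2: explicit loop + HashMap, using merge() to increment the bucket.
    static Map<Instant, Long> withLoop(List<Instant> timestamps) {
        Map<Instant, Long> counts = new HashMap<>();
        for (Instant t : timestamps) {
            counts.merge(t.truncatedTo(ChronoUnit.HOURS), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Instant> timestamps = List.of(
            Instant.parse("2024-01-01T10:15:00Z"),
            Instant.parse("2024-01-01T10:45:00Z"),
            Instant.parse("2024-01-01T11:05:00Z"));
        System.out.println(withStream(timestamps));
        System.out.println(withLoop(timestamps).equals(withStream(timestamps)));
    }
}
```

Truncating an Instant gives UTC hour buckets. If the aggregates should follow local wall-clock hours, convert to a ZonedDateTime before truncating.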
Idiomatic Drill
Find any code using SimpleDateFormat. Replace with DateTimeFormatter. Explain why the old one isn't thread-safe (mutable Calendar field, parsed state held internally). Audit any static SimpleDateFormat field for concurrent-use bugs.
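A sketch of what the replacement might look like, assuming the legacy code parsed and formatted plain dates with a pattern like yyyy-MM-dd. The class, constant, and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical replacement for a class that held a static SimpleDateFormat.
final class DayFormat {
    // DateTimeFormatter is immutable and thread-safe, so sharing one instance
    // in a static field is fine. A static SimpleDateFormat in the same spot
    // would be mutated by every parse/format call.
    static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("uuuu-MM-dd");

    static LocalDate parseDay(String text) {
        return LocalDate.parse(text, DAY);
    }

    static String formatDay(LocalDate day) {
        return DAY.format(day);
    }

    public static void main(String[] args) {
        LocalDate day = parseDay("2024-03-05");
        System.out.println(day.getDayOfWeek());
        System.out.println(formatDay(day.plusDays(1)));
    }
}
```

Pattern letters are mostly, but not entirely, shared between the two APIs, so check each migrated pattern against the DateTimeFormatter documentation rather than copying it blindly.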
Production Hardening Slice
Read every method on Optional. Then read Brian Goetz's guidance, which in paraphrase says: Optional is for return types, not parameters and not fields. Internalize.
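Related: Stream.findFirst() returns the first element in encounter order, so it is deterministic on an ordered stream; Stream.findAny() may return any element, which spares a parallel pipeline the ordering work. Both short-circuit. Use the one that matches your intent.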
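A minimal sketch of Optional used as a return type, with findFirst() supplying it. The class and method names are hypothetical; the parameters stay plain types, in line with the guidance above.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical lookup class: Optional appears only in the return type.
final class NameLookup {
    // "No match" is a normal outcome here, so the return type says so.
    // findFirst() respects the list's encounter order.
    static Optional<String> firstStartingWith(List<String> names, String prefix) {
        return names.stream()
                    .filter(n -> n.startsWith(prefix))
                    .findFirst();
    }

    // The caller decides what "absent" means; no null checks are needed.
    static String greeting(List<String> names, String prefix) {
        return firstStartingWith(names, prefix)
            .map(n -> "Hello, " + n)
            .orElse("Nobody here");
    }

    public static void main(String[] args) {
        List<String> names = List.of("Ada", "Alan", "Grace");
        System.out.println(greeting(names, "A"));
        System.out.println(greeting(names, "Z"));
    }
}
```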