Combine's .collect(.byTime)

Post mostly for programmers…

The biggest bug in Bike’s 1.0 release was excessive CPU use while in the background. The bug is fixed now, but it’s embarrassing, since Bike is a native macOS app that isn’t supposed to do that sort of thing.

How did it happen?

First, I didn’t realize that each time you use .collect(.byTime) you are setting up a new repeating timer. I had expected that a timer was involved, but thought it would only get scheduled when new items were coming into the publisher. Nope, it repeats from the moment the pipeline is set up.

Second, I stress test Bike with one big outline, but this problem doesn’t show up noticeably until you open multiple documents. I was using .collect(.byTime) a few times per document, so every open document added a few more repeating timers. After the issue was reported I tried opening 30 documents, and CPU usage sat at 50% while Bike was in the background. Ugh.

Why use .collect(.byTime) in the first place? Collect by time is really useful because it allows your model to change often, while not flooding your view with updates.

For example, Bike schedules background spell checking when an item scrolls into view. When that work is finished the view needs to be updated. The problem is that spell check results can arrive quickly, and a separate view update for each new result will slow down scrolling.

Bike used .collect(.byTime) to collect those spelling results and only update the view every 16 ms or so. It’s a big performance win and makes scrolling fast while also showing spelling errors “immediately” as they scroll into view. Except for the background CPU usage it’s great.
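For readers who haven’t used it, here is a minimal sketch of that kind of pipeline. It is not Bike’s actual code: SpellingResult, spellingResults, and receivedBatchSizes are hypothetical names made up for illustration, and the sink just records batch sizes where a real app would update its view.

```swift
import Combine
import Foundation

// Hypothetical stand-in for a spell check result; not Bike's real type.
struct SpellingResult {
    let itemID: Int
    let misspelledRanges: [Range<Int>]
}

// Background spell checking would send its results into this subject.
let spellingResults = PassthroughSubject<SpellingResult, Never>()

// Recorded only so the batching is visible in this sketch.
var receivedBatchSizes: [Int] = []

// Group the results that arrive within each 16 ms window into one array,
// so the view is updated once per window instead of once per result.
let cancellable = spellingResults
    .collect(.byTime(RunLoop.main, .milliseconds(16)))
    .sink { batch in
        // A real app would apply the whole batch in one view update here.
        receivedBatchSizes.append(batch.count)
    }
```

The catch described above applies to every pipeline built this way: the timer behind the operator keeps running for as long as the subscription is alive, whether or not any results are arriving.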

I’ve posted a .collect(.byTime) replacement (for my use case) here. It solves the problem by only scheduling on the runloop once per group of incoming results. It appears to be working well and Bike now uses 0% CPU when in background even with 30 documents open.
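To give a rough idea of the approach, here is a sketch of “schedule once per group”. This is not the replacement I posted, just an illustration of the idea under some assumptions: Coalescer is a hypothetical name, it is a plain class rather than a Combine operator, and it assumes it is only ever used from the main thread.

```swift
import Foundation

/// Buffers incoming values and delivers them to `handler` as one batch.
/// Hypothetical helper for illustration; main-thread use only.
final class Coalescer<Value> {
    private var pending: [Value] = []
    private let interval: TimeInterval
    private let handler: ([Value]) -> Void

    init(interval: TimeInterval, handler: @escaping ([Value]) -> Void) {
        self.interval = interval
        self.handler = handler
    }

    func send(_ value: Value) {
        pending.append(value)
        // Only the first value of a group schedules a flush. Later values
        // join the same batch, and nothing is scheduled while idle.
        guard pending.count == 1 else { return }
        let timer = Timer(timeInterval: interval, repeats: false) { [weak self] _ in
            self?.flush()
        }
        RunLoop.main.add(timer, forMode: .common)
    }

    private func flush() {
        let batch = pending
        pending.removeAll()
        handler(batch)
    }
}
```

Each spelling result would go through send(_:), and the handler would do a single view update per batch. The difference from the repeating timer is that when no results are coming in, nothing is scheduled on the runloop at all.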