Java 8 parallel forEach progress indication


Question

For performance reason I would like to use a forEach loop of a parallel Lambda stream in order to process an instance of a Collection in Java. As this runs in a background Service I would like to use the updateProgress(double,double) method in order to inform the user about the current progress.

In order to indicate the current progress I need a certain progress indicator in form of a Integer counter. However, this is not possible as I can only access final variables within the Lambda expression.

Code example see below, Collection is only a place holder for any possible instance of a Collection:

int progress = 0;
Collection.parallelStream().forEach(signer -> {
   progress++;
   updateProgress(progress, Collection.size());     
});

I'm aware that I can solve this problem by using a simple for-loop. However, for performance reason it would nice to solve it in this way.

Does anybody know a more or less neat solution to this?

1
3
11/7/2014 4:41:46 PM

Accepted Answer

As proposed by markspace, using an AtomicInteger is a good solution:

AtomicInteger progress = new AtomicInteger();
Collection.parallelStream().forEach(signer -> {
    progress.incrementAndGet();
    // do some other useful work
});

I would not use the runLater() variant as your goal is a high performance, and if many parallel threads will generte JavaFX 'runLater' tasks, you will again create a bottleneck...

For the same reason I would NOT call an update to the ProgressBar each time, but use a seaparte JavaFX Timeline to update the progress bar in regular intervals independently from the processing threads.

Here is a full code comparing sequential and parallel processing with ProgressBar. If you remove the sleep(1) and set the number of items to 10 million it will still work concurrently and efficiently...

public class ParallelProgress extends Application {
static class ParallelProgressBar extends ProgressBar {
    AtomicInteger myDoneCount = new AtomicInteger();
    int           myTotalCount;
    Timeline      myWhatcher = new Timeline(new KeyFrame(Duration.millis(10), e -> update()));

    public void update() {
        setProgress(1.0*myDoneCount.get()/myTotalCount);
        if (myDoneCount.get() >= myTotalCount) {
            myWhatcher.stop();
            myTotalCount = 0;
        }
    }

    public boolean isRunning() { return myTotalCount > 0; }

    public void start(int totalCount) {
        myDoneCount.set(0);
        myTotalCount = totalCount;
        setProgress(0.0);
        myWhatcher.setCycleCount(Timeline.INDEFINITE);
        myWhatcher.play();
    }

    public void add(int n) {
        myDoneCount.addAndGet(n);
    }
}

HBox testParallel(HBox box) {
    ArrayList<String> myTexts = new ArrayList<String>();

    for (int i = 1; i < 10000; i++) {
        myTexts.add("At "+System.nanoTime()+" ns");
    }

    Button runp = new Button("parallel");
    Button runs = new Button("sequential");
    ParallelProgressBar progress = new ParallelProgressBar();

    Label result = new Label("-");

    runp.setOnAction(e -> {
        if (progress.isRunning()) return;
        result.setText("...");
        progress.start(myTexts.size());

        new Thread() {
            public void run() {
                long ms = System.currentTimeMillis();
                myTexts.parallelStream().forEach(text -> {
                    progress.add(1);
                    try { Thread.sleep(1);} catch (Exception e1) { }
                });
                Platform.runLater(() -> result.setText(""+(System.currentTimeMillis()-ms)+" ms"));
            }
        }.start();
    });

    runs.setOnAction(e -> {
        if (progress.isRunning()) return;
        result.setText("...");
        progress.start(myTexts.size());
        new Thread() {
            public void run() {
                final long ms = System.currentTimeMillis();
                myTexts.forEach(text -> {
                    progress.add(1);
                    try { Thread.sleep(1);} catch (Exception e1) { }
                });
                Platform.runLater(() -> result.setText(""+(System.currentTimeMillis()-ms)+" ms"));
            }
        }.start();
    });

    box.getChildren().addAll(runp, runs, progress, result);
    return box;
}


@Override
public void start(Stage primaryStage) throws Exception {        
    primaryStage.setTitle("ProgressBar's");

    HBox box = new HBox();
    Scene scene = new Scene(box,400,80,Color.WHITE);
    primaryStage.setScene(scene);

    testParallel(box);

    primaryStage.show();   
}

public static void main(String[] args) { launch(args); }
}
7
8/4/2016 2:15:39 PM

The naive solution would be to have progress as a field of some surrounding object; then referring to progress from a lambda closure would actually mean this.progress, where this is final, thus the compiler would not complain. However, the resulting code would access the progress field from multiple threads concurrently, which could cause race conditions. I suggest restricting access to the progress field to the JavaFX application thread, by using Platform.runLater. The whole solution then looks like this:

// accessed only on JavaFX application thread
private int progress = 0;

// invoked only on the JavaFX application thread
private void increaseProgress() {
    progress++;
    updateProgress(progress, collection.size());
}

private void processCollection() {
    collection.parallelStream().forEach(signer -> {
        // do the work (on any thread)
        // ...

        // when done, update progress on the JavaFX thread
        Platfrom.runLater(this::increaseProgress);
    });
}

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon