HDDS-14524. Add freon test that uses hfs API for both writes and reads with data validation by chihsuan · Pull Request #10651 · apache/ozone

chihsuan · 2026-07-02T15:06:17Z

What changes were proposed in this pull request?

This PR adds a new Freon workload, dfsrw (dfs-read-write-validator), that exercises a Hadoop-compatible file system (o3fs:// / ofs://) with a mixed read-write workload and per-file data validation.

Each worker thread:

writes a file whose content carries a distinct per-write marker
keeps the latest hash of every path it wrote
reads back a random path it previously wrote and verifies the hash still matches

A digest mismatch fails the run, so the tool surfaces both data corruption (bytes changed) and stale reads (an overwritten path returning older bytes) under concurrent load.
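The per-thread loop above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it uses `java.nio.file` on a local temp directory as a stand-in for the Hadoop `FileSystem`, assumes SHA-256 as the digest, and the class and method names (`ReadWriteValidatorSketch`, `runWorker`, `content`) are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class ReadWriteValidatorSketch {
  // Assumption: the marker is one long (8 bytes), matching the --size minimum.
  static final int MARKER_BYTES = Long.BYTES;

  /** Builds exactly {@code size} bytes: an 8-byte marker followed by filler. */
  static byte[] content(long marker, int size) {
    if (size < MARKER_BYTES) {
      throw new IllegalArgumentException("size must be >= " + MARKER_BYTES);
    }
    ByteBuffer buf = ByteBuffer.allocate(size);
    buf.putLong(marker);
    while (buf.hasRemaining()) {
      buf.put((byte) 'x');
    }
    return buf.array();
  }

  static byte[] digest(byte[] data) throws NoSuchAlgorithmException {
    // Assumption: SHA-256; the PR does not name the algorithm here.
    return MessageDigest.getInstance("SHA-256").digest(data);
  }

  /** Runs one worker; returns the number of validated reads, throws on mismatch. */
  static int runWorker(Path dir, int threadId, int steps, int pathCount, int size)
      throws IOException, NoSuchAlgorithmException {
    Map<Path, byte[]> latest = new HashMap<>();  // path -> digest of latest write
    List<Path> written = new ArrayList<>();      // distinct paths, for random picks
    Random random = new Random(threadId);
    int validated = 0;
    for (long step = 0; step < steps; step++) {
      // The thread id is part of the path, so workers never share files.
      Path path = dir.resolve("t" + threadId + "-f" + (step % pathCount));
      byte[] data = content(step, size);         // distinct marker per write
      Files.write(path, data);                   // overwrites when the path is reused
      if (latest.put(path, digest(data)) == null) {
        written.add(path);
      }
      // Read back a random path this worker wrote and compare digests.
      Path target = written.get(random.nextInt(written.size()));
      byte[] actual = digest(Files.readAllBytes(target));
      if (!Arrays.equals(latest.get(target), actual)) {
        throw new IllegalStateException("Digest mismatch for " + target);
      }
      validated++;
    }
    return validated;
  }

  public static void main(String[] args) throws Exception {
    Path dir = Files.createTempDirectory("dfsrw-sketch");
    System.out.println(runWorker(dir, 0, 20, 4, 64) + " reads validated");
  }
}
```

Because the map holds only the digest of the most recent write, a read that returns bytes from an earlier write to the same path fails the comparison just like a corrupted read would.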

Design / robustness notes

Streaming digest - the write-side hash is computed on the fly via DigestOutputStream, so large files are never materialized in memory and file size is not capped at Integer.MAX_VALUE.
Exact size - the per-write marker is budgeted within --size, so a generated file is exactly --size bytes.
Meaningful under --duration - each write gets a distinct marker, so when a path is reused (as it is in time-based runs) its content and hash change; the read-back validates against the most recent write, which is what makes stale reads detectable.
Per-thread path namespace - the thread sequence id is part of each path, so one worker never overwrites a file another worker is reading back. Each thread tracks its paths in a per-path map, where an overwrite just updates the stored digest, so the map grows with the number of distinct paths rather than the number of writes; a hard cap limits it for very large runs.
Input bounds - --size must be at least 8 bytes (the width of the per-write marker) and --buffer / --copy-buffer must be positive; these are functional minimums, not arbitrary caps.
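The streaming-digest and exact-size points can be illustrated with a small sketch. It is written under assumptions rather than taken from the patch: SHA-256 as the digest, a constant filler byte in place of `ContentGenerator`, and a hypothetical helper named `writeWithDigest`. The only library pieces relied on are `java.security.DigestOutputStream` and `MessageDigest`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class StreamingDigestSketch {
  // Assumption: the per-write marker is one long, i.e. 8 bytes.
  static final int MARKER_BYTES = Long.BYTES;

  /**
   * Writes exactly {@code size} bytes (marker included) to {@code target}
   * and returns the digest of everything written. Only one buffer of
   * {@code bufferSize} bytes is held in memory, so size can exceed
   * Integer.MAX_VALUE.
   */
  static byte[] writeWithDigest(OutputStream target, long marker, long size,
      int bufferSize) throws IOException, NoSuchAlgorithmException {
    if (size < MARKER_BYTES) {
      throw new IllegalArgumentException("size must be >= " + MARKER_BYTES);
    }
    if (bufferSize <= 0) {
      throw new IllegalArgumentException("bufferSize must be positive");
    }
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    byte[] filler = new byte[bufferSize];
    Arrays.fill(filler, (byte) 'x');
    // DigestOutputStream updates md with every byte passed through it.
    try (DigestOutputStream out = new DigestOutputStream(target, md)) {
      out.write(ByteBuffer.allocate(MARKER_BYTES).putLong(marker).array());
      long remaining = size - MARKER_BYTES;  // marker is budgeted within size
      while (remaining > 0) {
        int chunk = (int) Math.min(filler.length, remaining);
        out.write(filler, 0, chunk);
        remaining -= chunk;
      }
    }
    return md.digest();
  }

  public static void main(String[] args) throws Exception {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    byte[] hash = writeWithDigest(sink, 42L, 100L, 16);
    System.out.println(sink.size() + " bytes, digest length " + hash.length);
  }
}
```

Two writes to the same path with different markers produce different digests even though the filler is identical, which is what lets the read-back distinguish the latest write from an older one.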

The workload extends HadoopBaseFreonGenerator (thread-local FileSystem with proper close handling) and reuses ContentGenerator, getDigest, and the runTests task runner.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14524

How was this patch tested?

New integration test TestHadoopFsReadWriteValidator, added as a @Nested case to the existing FreonTests suite (shared MiniOzoneCluster), parameterized over FILE_SYSTEM_OPTIMIZED and LEGACY bucket layouts. It runs dfsrw end to end and independently verifies the expected files were written with the requested size.
Ran locally against a MiniOzoneCluster: FreonTests$HadoopFsReadWriteValidator 2/2 passing; full TestFreon green (31 tests, 0 failures).
mvn checkstyle:check clean on the changed modules.

Generated-by: Claude Code (Claude Opus 4.8)

…s with data validation Add a Freon workload (dfsrw) that uses the Hadoop FS API to write files with per-file content, keep each file's hash, then read back a random file previously written by the thread and validate the hash matches. Covered by a new integration test running against FSO and LEGACY bucket layouts, wired into the existing FreonTests suite.

chihsuan force-pushed the HDDS-14524 branch from 4cee131 to 3b3bb84 Compare July 2, 2026 15:27

chihsuan marked this pull request as ready for review July 3, 2026 14:48

chihsuan wants to merge 1 commit into apache:master from chihsuan:HDDS-14524
