Format Conversion Modules Consistency Audit - mrrc-aw5.3¶
Date: 2025-12-28
Files Reviewed: json.rs, xml.rs, marcjson.rs, csv.rs, dublin_core.rs, mods.rs
Total Lines: 2266 lines across 6 modules
Overview¶
Format conversion modules provide serialization/deserialization to/from various metadata formats: - JSON: Generic JSON representation - XML/MARCXML: Standard MARC XML format - MARCJSON: JSON-LD standard format - CSV: Tabular export for spreadsheets - Dublin Core: Simplified 15-element schema - MODS: Rich metadata schema from Library of Congress
1. API Naming Consistency¶
Current API Surface¶
| Module | Export Functions | Pattern | Accepts |
|---|---|---|---|
| json | record_to_json() |
record_to_FORMAT |
Single Record |
json_to_record() |
FORMAT_to_record |
- | |
| xml | record_to_xml() |
record_to_FORMAT |
Single Record |
xml_to_record() |
FORMAT_to_record |
- | |
| marcjson | record_to_marcjson() |
record_to_FORMAT |
Single Record |
marcjson_to_record() |
FORMAT_to_record |
- | |
| csv | records_to_csv() |
record(s)_to_FORMAT |
PLURAL Slice |
| dublin_core | record_to_dublin_core() |
record_to_FORMAT |
Single Record |
dublin_core_to_xml() |
FORMAT_to_FORMAT |
- | |
| mods | record_to_mods_xml() |
record_to_FORMAT_xml |
Single Record |
Issues Identified¶
Issue #1: CSV Uses Plural Pattern
csv::records_to_csv(&[Record]) // Plural (batch)
json::record_to_json(&Record) // Singular
xml::record_to_xml(&Record) // Singular
marcjson::record_to_marcjson(&Record) // Singular
dublin_core::record_to_dublin_core(&Record) // Singular
mods::record_to_mods_xml(&Record) // Singular
Status: ⚠️ INCONSISTENT
Rationale: CSV is semantically batch-oriented (tabular), but breaks naming symmetry
Recommendation: Keep as-is (semantic correctness) but document the exception, OR add record_to_csv() single-record variant with delegation
Issue #2: MODS Uses _xml Suffix
mods::record_to_mods_xml(&Record) // Includes _xml suffix
dublin_core::record_to_dublin_core() // No suffix (but also outputs XML)
Status: ⚠️ INCONSISTENT
Rationale: Dublin Core also outputs XML but doesn't include suffix in function name
Recommendation: Standardize - either:
- Option A: Rename record_to_dublin_core() → record_to_dublin_core_xml() for symmetry
- Option B: Rename record_to_mods_xml() → record_to_mods() to match others
- Current: Acceptable as-is (MODS has _xml suffix to prevent confusion, as it's more technical)
Issue #3: Dublin Core Two-Step Pattern
// Dublin Core returns intermediate type, requires second call
let dc: DublinCoreRecord = dublin_core::record_to_dublin_core(&record)?;
let xml: String = dublin_core::dublin_core_to_xml(&dc)?;
// Others return final format directly
let json: Value = json::record_to_json(&record)?;
let xml: String = xml::record_to_xml(&record)?;
let csv: String = csv::records_to_csv(&[record])?;
Status: ⚠️ INCONSISTENT PATTERN
Rationale: DC returns intermediate struct, allowing partial processing
Recommendation: Add convenience function record_to_dublin_core_xml() for consistency, keep existing for flexibility
2. Error Handling Consistency¶
Error Handling Patterns¶
All modules follow consistent pattern:
Type Analysis:
- json.rs: Returns Result<Value> (serde_json::Value)
- xml.rs: Returns Result<String> (raw XML)
- marcjson.rs: Returns Result<Value> (serde_json::Value)
- csv.rs: Returns Result<String> (raw CSV)
- dublin_core.rs: Returns Result<DublinCoreRecord> (intermediate struct)
- mods.rs: Returns Result<String> (raw XML)
Return Type Consistency:
| Format | Direct XML | Uses Serde | Intermediate Type | Status |
|---|---|---|---|---|
| JSON | - | ✓ serde_json::Value | - | ✓ Consistent |
| XML | ✓ String | - | - | ✓ Consistent |
| MARCJSON | - | ✓ serde_json::Value | - | ✓ Consistent |
| CSV | ✓ String | - | - | ✓ Consistent |
| Dublin Core | ✓ String (second fn) | - | ✓ DublinCoreRecord | ⚠️ Two-step |
| MODS | ✓ String | - | - | ✓ Consistent |
Status: ✓ GOOD - Mostly consistent, Dublin Core is intentional design
Recommendation: Document the Dublin Core pattern as intentional (allows filtering/modification before serialization)
3. Code Organization Patterns¶
String Building Approaches¶
XML Modules (xml.rs, mods.rs, dublin_core.rs output):
- Use String with write!() macro
- Manual tag construction
- Proper XML escaping
Example (mods.rs):
let mut xml = String::from("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
write!(xml, "<mods>...</mods>")?;
JSON Modules (json.rs, marcjson.rs):
- Use serde_json json!() macro
- Type-safe via serde
- Automatic escaping
Example (json.rs):
CSV Module (csv.rs):
- Use String with writeln!() macro
- Custom escape function for CSV values
Status: ✓ GOOD - Each uses appropriate tool for format
Recommendation: KEEP - serde for structured formats, String for streaming
4. Field Mapping Consistency¶
MARC Field Crosswalks¶
All modules document their MARC field mappings in module-level doc comments:
| Module | Mappings Documented | Approach | Quality |
|---|---|---|---|
| JSON | ✓ Yes | Generic structure | Good |
| XML | ✓ Yes | MARCXML standard | Good |
| MARCJSON | ✓ Yes | Standard format | Good |
| CSV | ✓ Yes | Tabular columns | Good |
| Dublin Core | ✓ Yes | 15-element mapping | Excellent |
| MODS | ✓ Yes | Library of Congress | Excellent |
Status: ✓ EXCELLENT - All document their mappings clearly
Recommendation: NONE
5. Helper Function Organization¶
Code Organization by Module¶
json.rs (288 lines): - 2 main functions (to/from) - No helper functions - Straightforward implementation
xml.rs (306 lines): - 2 main functions (to/from) - 3 helper types (MarcXmlRecord, MarcXmlControlField, MarcXmlDataField) - Uses serde for parsing
marcjson.rs (295 lines): - 2 main functions (to/from) - 2 helper functions (leader parsing) - Clean, focused
csv.rs (287 lines): - 2 main functions (batch to, filtered to) - 2 helper functions (escape_csv_value, etc.) - Good separation of concerns
dublin_core.rs (494 lines): - 1 struct (DublinCoreRecord - 15 fields) - 2 main functions (to DC, DC to XML) - 15+ internal helper functions for MARC→DC mapping - Most complex due to semantic mapping
mods.rs (596 lines): - 1 main function (record_to_mods_xml) - 15+ internal helper functions (write_titles, write_names, etc.) - Most complex due to rich metadata schema
Status: ✓ GOOD - Helper functions properly hidden, main functions clean
Recommendation: NONE
6. Round-Trip Conversion Support¶
| Format | to_FORMAT | FORMAT_to_record | Round-trip | Status |
|---|---|---|---|---|
| JSON | ✓ | ✓ | ✓ Yes | Complete |
| XML | ✓ | ✓ | ✓ Yes | Complete |
| MARCJSON | ✓ | ✓ | ✓ Yes | Complete |
| CSV | ✓ (→) | ✗ | ✗ No | Export-only |
| Dublin Core | ✓ | ✗ | ✗ No | Export-only |
| MODS | ✓ | ✗ | ✗ No | Export-only |
Status: ✓ GOOD - CSV, Dublin Core, MODS are intentionally export-only (lossy formats)
Recommendation: Document in module rustdoc that these are one-way conversions
7. Testing Coverage¶
Test Organization¶
- json.rs: 6 tests (serialize, deserialize, roundtrip)
- xml.rs: 8 tests (serialize, deserialize, roundtrip, element checking)
- marcjson.rs: 6 tests (serialize, deserialize, roundtrip)
- csv.rs: 5 tests (single field, multiple records, escaping)
- dublin_core.rs: 10 tests (field mapping verification)
- mods.rs: 8 tests (field mapping verification)
Status: ✓ GOOD - Reasonable coverage for each format
Recommendation: Consider adding integration tests for complete workflows (roundtrip with real data)
Summary of Inconsistencies¶
Critical (Breaking Changes):¶
None identified.
Medium Priority (Recommend standardization):¶
| Issue | Location | Impact | Recommendation |
|---|---|---|---|
| CSV plural naming | csv.rs | API symmetry | Acceptable - semantic correctness (batch). Document exception. |
| MODS _xml suffix | mods.rs | Naming symmetry | Acceptable - prevents confusion. Consider for Dublin Core too. |
| Dublin Core two-step | dublin_core.rs | API simplicity | Add record_to_dublin_core_xml() convenience function. |
Low Priority (Documentation):¶
- Document that CSV, Dublin Core, MODS are one-way (export-only)
- Clarify batch vs single-record semantics for CSV
Recommendations¶
Immediate (Next Review Cycle)¶
- Add Convenience Function:
dublin_core::record_to_dublin_core_xml() - Combines
record_to_dublin_core()anddublin_core_to_xml() -
Improves API consistency with other formats
-
Document CSV Batch Pattern: Add note in csv.rs module doc explaining plural naming is intentional
-
CSV Single-Record Variant (Optional):
record_to_csv(&Record) -> Result<String> - Would improve consistency with other modules
- Could delegate to
records_to_csv(&[record])
Long-term (Future Enhancement)¶
- Consider trait-based design for format converters (e.g.,
Serializabletrait) - Add reverse conversions for CSV (lossy but useful)
- Consider generic serialization builder
Conclusion¶
Overall Assessment: Format conversion modules are WELL-DESIGNED and CONSISTENT
- ✓ Error handling is uniform
- ✓ Documentation is excellent
- ✓ Field mappings are clear
- ✓ Helper functions properly organized
- ⚠️ Minor naming inconsistencies (acceptable with documentation)
Action Items:
1. Add record_to_dublin_core_xml() convenience function (low effort, improves consistency)
2. Document CSV batch pattern in module rustdoc (documentation only)
3. Consider adding CSV single-record variant (optional, low priority)
Audit Status: Ready for closure