Upload files and/or paste URLs. The system will extract and clean the text, then create:
- an eve_corpus-*.txt file (plain-text corpus)
- an eve_docs-*.jsonl file (one JSON object per line: {source_type, source, text, created_at, chunk_index})
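
To make the JSONL record layout concrete, here is a minimal Python sketch of how one record per chunk could be built and serialized. Only the five field names come from the description above. The helper names (make_records, to_jsonl, read_jsonl), the fixed-size character chunking, the chunk_size parameter, the ISO 8601 timestamp, and the example "url" value for source_type are all assumptions for illustration, not the system's actual behavior.

```python
import json
from datetime import datetime, timezone


def make_records(source_type, source, text, chunk_size=1000):
    """Build one record per chunk (hypothetical helper).

    Assumption: chunks are fixed-size character windows; the real
    system may split on sentences, tokens, or something else.
    """
    # Assumption: created_at is an ISO 8601 UTC timestamp.
    created_at = datetime.now(timezone.utc).isoformat()
    return [
        {
            "source_type": source_type,
            "source": source,
            "text": text[start:start + chunk_size],
            "created_at": created_at,
            "chunk_index": index,
        }
        for index, start in enumerate(range(0, len(text), chunk_size))
    ]


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def read_jsonl(payload):
    """Parse JSON Lines text back into a list of dicts."""
    return [json.loads(line) for line in payload.split("\n") if line.strip()]


if __name__ == "__main__":
    records = make_records("url", "https://example.com", "abcdefghij", chunk_size=4)
    print(to_jsonl(records), end="")
```

Because json.dumps escapes newlines inside strings, each record stays on a single line even when the chunk text itself contains line breaks, which is what lets a reader process the file line by line.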
datasets/: