Read File

Read and access file contents within the sandbox environment with comprehensive format support, automatic encoding detection, and streaming capabilities for large files.

📖 File Reading Capabilities

File reading supports text, binary, and structured data formats, with automatic encoding detection, format parsing, and memory-efficient streaming for large files.

Overview

The Read File tool enables comprehensive file content access within the sandbox environment, supporting multiple file formats, encoding detection, and optimized reading strategies for different file types and sizes.

Key Features

  • Multi-Format Support - Read text, binary, JSON, CSV, XML, and structured data
  • Encoding Detection - Automatic character encoding detection and conversion
  • Streaming Support - Memory-efficient reading for large files
  • Format Parsing - Built-in parsing for common data formats
  • Range Reading - Read specific portions or byte ranges of files

Methods

readFile

Read file contents from the sandbox environment.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| filePath | String | Yes | Path to the file in the sandbox environment |
| readMode | String | No | Read mode: 'text', 'binary', 'json', 'csv', 'xml' (default: 'auto') |
| encoding | String | No | Character encoding (default: 'auto-detect') |
| startByte | Number | No | Starting byte position for range reading |
| endByte | Number | No | Ending byte position for range reading |
| maxSize | Number | No | Maximum file size to read, in bytes (default: 50 MB) |
| parseOptions | Object | No | Format-specific parsing options |

Example:

{
  "filePath": "/sandbox/data/dataset.csv",
  "readMode": "csv",
  "encoding": "utf-8",
  "parseOptions": {
    "delimiter": ",",
    "header": true
  }
}

Output:

  • success (Boolean) - Read operation success status
  • content (String/Object/Array) - File content (format depends on readMode)
  • metadata (Object) - File metadata information
    • size (Number) - File size in bytes
    • encoding (String) - Detected or specified encoding
    • mimeType (String) - File MIME type
    • lastModified (String) - Last modification timestamp
  • readInfo (Object) - Reading operation details
    • bytesRead (Number) - Number of bytes read
    • readTime (Number) - Read operation duration in milliseconds
    • format (String) - Detected file format
    • isPartial (Boolean) - Whether this is a partial read
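
As a sketch, the result object can be consumed like this. The helper and the sample payload below are illustrative (built from the documented fields), not produced by a real call:

```python
def summarize_read(result):
    """Build a one-line summary from a readFile result dict.

    Assumes the documented fields: success, metadata.size,
    metadata.encoding, readInfo.bytesRead, readInfo.isPartial.
    """
    if not result.get("success"):
        return "read failed"
    meta = result["metadata"]
    info = result["readInfo"]
    partial = " (partial)" if info.get("isPartial") else ""
    return (f"{info['bytesRead']} of {meta['size']} bytes, "
            f"encoding={meta['encoding']}{partial}")

# Illustrative sample payload (not from a real call)
sample = {
    "success": True,
    "content": "hello",
    "metadata": {"size": 5, "encoding": "utf-8",
                 "mimeType": "text/plain",
                 "lastModified": "2024-01-01T00:00:00Z"},
    "readInfo": {"bytesRead": 5, "readTime": 2,
                 "format": "text", "isPartial": False},
}
print(summarize_read(sample))  # → 5 of 5 bytes, encoding=utf-8
```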

Text File Reading

Plain Text Files
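
In 'text' mode, `content` arrives as a single string. A common follow-up step is splitting it into lines while tolerating mixed line endings. The helper below is an illustrative sketch operating on a result dict of the documented shape; it is not part of the tool's API:

```python
def text_lines(result):
    """Split a text-mode readFile result into lines.

    str.splitlines() handles Unix (\n), Windows (\r\n), and
    old Mac (\r) line endings uniformly.
    """
    if not result.get("success"):
        return []
    return result["content"].splitlines()

# Illustrative result for a small text read (mixed line endings)
result = {"success": True, "content": "alpha\r\nbeta\ngamma"}
for line in text_lines(result):
    print(line)
```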

Structured Data Reading

JSON Files
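
With readMode 'json', `content` is returned already parsed (an object or array); with readMode 'text', the same file comes back as a raw string. A small adapter can accept either form. This helper is an illustrative sketch, not part of the tool's API:

```python
import json

def as_json(content):
    """Return parsed JSON whether readFile already parsed it
    (readMode 'json') or returned raw text (readMode 'text')."""
    if isinstance(content, (dict, list)):
        return content          # already parsed by readMode 'json'
    return json.loads(content)  # raw text: parse it here

config = as_json('{"db_host": "localhost", "db_port": 5432}')
print(config["db_port"])  # → 5432
```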

Binary File Reading

Binary Data Handling

🔒 Binary File Limitations

Binary file reading is limited to specific formats and size restrictions for security. Large binary files should be processed in chunks.

def read_binary_file(file_path, max_size=10485760):  # 10MB default
    """Read binary file with size validation."""
    
    # Check file size first
    metadata_result = readFile({
        "filePath": file_path,
        "startByte": 0,
        "endByte": 1,
        "readMode": "binary"
    })
    
    if not metadata_result['success']:
        return None
    
    file_size = metadata_result['metadata']['size']
    
    if file_size > max_size:
        print(f"File too large: {file_size} bytes (max: {max_size})")
        return None
    
    # Read entire binary file
    result = readFile({
        "filePath": file_path,
        "readMode": "binary"
    })
    
    return result

# Usage for image files
image_data = read_binary_file("/sandbox/images/chart.png")
if image_data and image_data['success']:
    print(f"Read {len(image_data['content'])} bytes of image data")

File Format Detection

def detect_file_format(file_path):
    """Detect file format by reading file headers."""
    
    # Read first few bytes to detect format
    header_result = readFile({
        "filePath": file_path,
        "readMode": "binary",
        "startByte": 0,
        "endByte": 64  # Read first 64 bytes
    })
    
    if not header_result['success']:
        return None
    
    header_bytes = header_result['content']
    
    # File signature detection
    format_signatures = {
        'PDF': b'%PDF',
        'PNG': b'\x89PNG\r\n\x1a\n',
        'JPEG': b'\xff\xd8\xff',
        'GIF': b'GIF8',
        'ZIP': b'PK\x03\x04'  # DOCX, XLSX, and other ZIP-based formats
                              # share this signature; telling them apart
                              # requires inspecting the archive contents
    }
    
    detected_format = None
    for format_name, signature in format_signatures.items():
        if header_bytes.startswith(signature):
            detected_format = format_name
            break
    
    return {
        "format": detected_format,
        "mime_type": header_result['metadata'].get('mimeType'),
        "file_size": header_result['metadata']['size']
    }

# Usage
format_info = detect_file_format("/sandbox/uploads/document.pdf")
if format_info:
    print(f"Detected format: {format_info['format']}")
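
The signature lookup can be factored into a small pure function that is easy to unit-test on raw header bytes, independent of any `readFile` call. The signatures below mirror the table above:

```python
# Magic-byte prefixes for common formats; checked in order.
FORMAT_SIGNATURES = [
    ("PDF", b"%PDF"),
    ("PNG", b"\x89PNG\r\n\x1a\n"),
    ("JPEG", b"\xff\xd8\xff"),
    ("GIF", b"GIF8"),
    ("ZIP", b"PK\x03\x04"),  # also matches DOCX/XLSX (ZIP-based)
]

def match_signature(header: bytes):
    """Return the format name whose magic bytes prefix `header`,
    or None if no known signature matches."""
    for name, magic in FORMAT_SIGNATURES:
        if header.startswith(magic):
            return name
    return None

print(match_signature(b"%PDF-1.7 ..."))  # → PDF
```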

Range and Partial Reading

Byte Range Reading

def read_file_range(file_path, start_byte, chunk_size):
    """Read specific byte range from file."""
    
    end_byte = start_byte + chunk_size - 1
    
    result = readFile({
        "filePath": file_path,
        "readMode": "binary",
        "startByte": start_byte,
        "endByte": end_byte
    })
    
    if result['success']:
        return {
            "content": result['content'],
            "bytes_read": len(result['content']),
            "start_position": start_byte,
            "end_position": min(end_byte, result['metadata']['size'] - 1)
        }
    
    return None

def read_file_tail(file_path, tail_size=1024):
    """Read last N bytes of file (like 'tail' command)."""
    
    # Get file size first
    size_result = readFile({
        "filePath": file_path,
        "startByte": 0,
        "endByte": 1,
        "readMode": "binary"  # probe in binary to avoid decode errors
    })
    
    if not size_result['success']:
        return None
    
    file_size = size_result['metadata']['size']
    start_byte = max(0, file_size - tail_size)
    
    tail_result = readFile({
        "filePath": file_path,
        "readMode": "text",
        "startByte": start_byte
    })
    
    return tail_result['content'] if tail_result['success'] else None

# Usage examples
file_chunk = read_file_range("/sandbox/data/large_file.dat", 1000000, 65536)
log_tail = read_file_tail("/sandbox/logs/app.log", 2048)
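
The range reader above generalizes to a chunked iterator for processing large files piece by piece. The sketch below models the byte-range interface with a plain callable so it runs standalone; in the sandbox, `fetch` would wrap a `readFile` call with startByte/endByte:

```python
def iter_chunks(fetch, total_size, chunk_size=65536):
    """Yield (offset, chunk) pairs covering a file of `total_size` bytes.

    `fetch(start, end)` returns the bytes in the inclusive range
    [start, end]; here it is a stand-in for a readFile range call.
    """
    offset = 0
    while offset < total_size:
        end = min(offset + chunk_size, total_size) - 1
        yield offset, fetch(offset, end)
        offset = end + 1

# Stand-in for readFile range reads: an in-memory buffer
data = bytes(range(256)) * 10               # 2560 bytes of sample data
fetch = lambda s, e: data[s:e + 1]

chunks = list(iter_chunks(fetch, len(data), chunk_size=1000))
print([len(c) for _, c in chunks])          # → [1000, 1000, 560]
```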

Error Handling

Common Reading Issues

| Error Type | Cause | Resolution |
| --- | --- | --- |
| File Not Found | File doesn't exist at the specified path | Verify the file path and existence |
| Permission Denied | Insufficient read permissions | Check file permissions |
| Encoding Error | Character encoding mismatch | Specify the correct encoding or use auto-detect |
| File Too Large | File exceeds size limits | Use range reading or streaming |
| Format Error | Invalid file format for the specified readMode | Use 'auto' mode or the correct format |

Robust File Reading

def safe_file_read(file_path, **options):
    """Robust file reading with comprehensive error handling."""
    
    try:
        # First check if file exists and get basic info
        basic_result = readFile({
            "filePath": file_path,
            "startByte": 0,
            "endByte": 1,
            "readMode": "binary"
        })
        
        if not basic_result['success']:
            return {"error": "File not accessible", "path": file_path}
        
        file_size = basic_result['metadata']['size']
        
        # Check size limits
        max_size = options.get('maxSize', 50 * 1024 * 1024)  # 50MB default
        if file_size > max_size:
            return {
                "error": f"File too large: {file_size} bytes (max: {max_size})",
                "size": file_size
            }
        
        # Attempt to read with specified options
        result = readFile({
            "filePath": file_path,
            **options
        })
        
        if result['success']:
            return result
        else:
            # Try fallback reading modes
            fallback_modes = ['text', 'binary']
            for mode in fallback_modes:
                if options.get('readMode') != mode:
                    try:
                        fallback_result = readFile({
                            "filePath": file_path,
                            "readMode": mode
                        })
                        if fallback_result['success']:
                            fallback_result['warning'] = f"Used fallback mode: {mode}"
                            return fallback_result
                    except Exception:
                        continue
            
            return {"error": "All reading attempts failed", "path": file_path}
    
    except Exception as e:
        return {"error": f"Unexpected error: {str(e)}", "path": file_path}

# Usage with error handling
safe_result = safe_file_read("/sandbox/data/problematic_file.txt", readMode="json")
if 'error' in safe_result:
    print(f"❌ Reading failed: {safe_result['error']}")
else:
    print(f"✅ Successfully read {len(str(safe_result['content']))} characters")

Integration Patterns

With Code Execution Tools

# Read configuration and execute code based on settings
def execute_with_config(config_file_path, code_template):
    """Read configuration and execute parameterized code."""
    
    # Read configuration file
    config_result = readFile({
        "filePath": config_file_path,
        "readMode": "json"
    })
    
    if not config_result['success']:
        return None
    
    config = config_result['content']
    
    # Generate code with configuration values
    parameterized_code = code_template.format(**config)
    
    # Execute the code
    exec_result = codeExecution({
        "language": "python",
        "code": parameterized_code
    })
    
    return {
        "config": config,
        "generated_code": parameterized_code,
        "execution_result": exec_result
    }

# Usage
template = """
database_host = "{db_host}"
database_port = {db_port}
print(f"Connecting to {{database_host}}:{{database_port}}")
"""

result = execute_with_config("/sandbox/config/db_config.json", template)

Next Steps: Combine with Write File for data processing workflows, or use File Search for content analysis.