Add TOON support to ZIO Schema 2

This ticket is for adding TOON support to ZIO Schema 2 as a new Format, with an associated codec, deriver, tests, and documentation.

**NOTE**: What follows is an AI-generated description of the problem and a sketch of a solution. It may be useful, but it certainly contains errors, and if you don't know enough to find and fix those errors, you shouldn't attempt to complete this ticket.

------------------------

# TOON Format Implementation Guide for ZIO Schema 2

## Executive Summary

This guide provides a complete specification for implementing TOON (Token-Oriented Object Notation) codec support in ZIO Schema 2 (zio-blocks). TOON is a compact, human-readable serialization format designed to minimize token usage when passing structured data to Large Language Models, with a claimed 30-60% token reduction compared to JSON while maintaining lossless bidirectional conversion.

The implementation will follow the established patterns in zio-blocks, mirroring the architecture of `JsonBinaryCodecDeriver` while adding TOON-specific capabilities for array format selection and indentation-based structure.

---

## Part 1: TOON Format Specification

### 1.1 Overview

TOON was created by Johann Schopplich in 2025 to address the inefficiency of JSON when used in LLM prompts. The format combines YAML-style indentation with CSV-style tabular data representation. The specification is maintained at [github.com/toon-format/spec](https://github.com/toon-format/spec), currently at version 3.0.

**Design goals:**
- Minimize token count for LLM context windows
- Maintain human readability
- Enable lossless JSON↔TOON conversion
- Schema-aware encoding for maximum compression

### 1.2 Data Types

TOON supports the complete JSON data model:

| Type | TOON Representation | Example |
|------|---------------------|---------|
| String | Unquoted (default) or quoted | `hello` or `"hello, world"` |
| Number | Decimal form only (no scientific notation) | `42`, `3.14159` |
| Boolean | Lowercase keywords | `true`, `false` |
| Null | Keyword | `null` |
| Array | Three formats (see §1.4) | `items[3]: a,b,c` |
| Object | Indentation-based nesting | See §1.3 |

### 1.3 Object Encoding

Objects use indentation (2 spaces default) with colon-separated key-value pairs:

```toon
name: Alice
age: 30
address:
  street: 123 Main St
  city: Springfield
```

Equivalent JSON:
```json
{"name":"Alice","age":30,"address":{"street":"123 Main St","city":"Springfield"}}
```

**Key rules:**
- Keys are unquoted when they match the identifier pattern `^[A-Za-z_][A-Za-z0-9_.]*$`; any other key must be quoted
- Primitive values follow the key on the same line; nested structures are indented on the lines below the key
- Empty objects: just the key with colon and nothing following
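
The nesting rules above can be illustrated with a minimal sketch. The `Value`, `Prim`, and `Obj` types and the `render` function are hypothetical names invented for this example; they are not part of zio-blocks or any TOON library, and quoting of keys and values is left out.

```scala
object ToonObjectSketch {
  // Hypothetical mini-model of TOON values, used only for this illustration.
  sealed trait Value
  final case class Prim(text: String) extends Value                 // an already-formatted primitive
  final case class Obj(fields: List[(String, Value)]) extends Value // ordered key/value pairs

  // Render an object's fields one per line, indenting nested objects
  // by `indent` spaces per nesting level.
  def render(obj: Obj, indent: Int = 2, depth: Int = 0): List[String] = {
    val pad = " " * (indent * depth)
    obj.fields.flatMap {
      case (key, Prim(text))  => List(s"$pad$key: $text")
      case (key, nested: Obj) => s"$pad$key:" :: render(nested, indent, depth + 1)
    }
  }
}
```

An empty nested object produces only the `key:` line, which matches the last rule in the list.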

### 1.4 Array Encoding Formats

TOON's primary innovation is intelligent array encoding. The format supports three array representations:

#### Tabular Format (Maximum Compression)

For arrays of uniform objects where all elements share identical keys with only primitive values:

```toon
users[3]{id,name,email}:
  1,Alice,alice@example.com
  2,Bob,bob@example.com
  3,Carol,carol@example.com
```

Equivalent JSON:
```json
{"users":[{"id":1,"name":"Alice","email":"alice@example.com"},{"id":2,"name":"Bob","email":"bob@example.com"},{"id":3,"name":"Carol","email":"carol@example.com"}]}
```

**Tabular eligibility requirements:**
1. All elements must be objects
2. All objects must have the same set of keys (the TOON spec allows key order to vary between objects; rows are always written in header order)
3. All field values must be primitives (not nested objects or arrays)
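
A minimal sketch of the eligibility check and the resulting output, assuming a hypothetical `Value` model (`Prim`, `Obj`) that exists only for this example. It compares key sets and writes each row in the first object's key order; a stricter check that also compares key order would be a small change. Quoting of cell values is not handled.

```scala
object TabularSketch {
  // Hypothetical mini-model of TOON values, used only for this illustration.
  sealed trait Value
  final case class Prim(text: String) extends Value
  final case class Obj(fields: List[(String, Value)]) extends Value

  // Returns the shared field names if `rows` qualifies for tabular form, else None.
  def tabularFields(rows: List[Value]): Option[List[String]] = rows match {
    case Obj(firstFields) :: _ =>
      val header = firstFields.map(_._1)
      val eligible = rows.forall {
        case Obj(fields) =>
          fields.length == header.length &&
            fields.map(_._1).toSet == header.toSet &&
            fields.forall(_._2.isInstanceOf[Prim])
        case _ => false
      }
      if (eligible) Some(header) else None
    case _ => None
  }

  // Render the header line plus one comma-separated row per element.
  def renderTabular(key: String, rows: List[Value]): Option[List[String]] =
    tabularFields(rows).map { header =>
      val columns = header.mkString(",")
      val lines = rows.collect { case Obj(fields) =>
        val byName = fields.toMap
        "  " + header.map(name => cell(byName(name))).mkString(",")
      }
      s"$key[${rows.length}]{$columns}:" :: lines
    }

  private def cell(v: Value): String = v match {
    case Prim(text) => text
    case Obj(_)     => "" // unreachable after the eligibility check
  }
}
```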

#### Inline Format (Primitive Arrays)

For arrays containing only primitive values:

```toon
tags[4]: javascript,react,typescript,node
numbers[5]: 1,2,3,4,5
```

#### List Format (Heterogeneous Data)

For arrays with mixed types, nested structures, or non-uniform objects:

```toon
items[3]:
  - name: Widget
    price: 9.99
  - name: Gadget
    price: 19.99
  - simple string value
```

### 1.5 String Quoting Rules

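Strings are **unquoted by default**. Quotes are required when the string:
- Is empty
- Has leading or trailing whitespace
- Equals `true`, `false`, or `null`, or looks like a number (otherwise it would not decode back to a string)
- Contains the active delimiter (comma by default)
- Contains a colon `:`
- Contains a double quote `"` or a backslash `\`
- Contains control characters
- Contains any of the characters `{`, `}`, `[`, `]`
- Starts with a hyphen `-` (it would read as a list marker)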

**Escape sequences** (only these five are valid):
- `\\` → backslash
- `\"` → double quote
- `\n` → newline
- `\r` → carriage return
- `\t` → tab
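
The five escape sequences can be sketched as a pair of functions. `quote` and `unquote` are hypothetical names for this example only; a real codec would write into the output buffer directly instead of building intermediate strings.

```scala
object ToonEscapeSketch {
  // Wrap `s` in double quotes, escaping only the five sequences listed above.
  def quote(s: String): String = {
    val sb = new StringBuilder("\"")
    s.foreach {
      case '\\' => sb.append("\\\\")
      case '"'  => sb.append("\\\"")
      case '\n' => sb.append("\\n")
      case '\r' => sb.append("\\r")
      case '\t' => sb.append("\\t")
      case c    => sb.append(c)
    }
    sb.append('"').toString
  }

  // Inverse of `quote`. Returns Left for missing quotes or any other escape.
  def unquote(s: String): Either[String, String] =
    if (s.length < 2 || s.head != '"' || s.last != '"') Left("not a quoted string")
    else {
      val body = s.substring(1, s.length - 1)
      val sb = new StringBuilder
      var i = 0
      var error: Option[String] = None
      while (i < body.length && error.isEmpty) {
        val c = body.charAt(i)
        if (c == '\\') {
          if (i + 1 >= body.length) error = Some("dangling backslash")
          else {
            body.charAt(i + 1) match {
              case '\\'  => sb.append('\\')
              case '"'   => sb.append('"')
              case 'n'   => sb.append('\n')
              case 'r'   => sb.append('\r')
              case 't'   => sb.append('\t')
              case other => error = Some("invalid escape: \\" + other)
            }
            i += 2
          }
        } else if (c == '"') error = Some("unescaped quote")
        else { sb.append(c); i += 1 }
      }
      error.toLeft(sb.toString)
    }
}
```

Rejecting unknown escapes on decode keeps the decoder strict about the "only these five" rule.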

### 1.6 Number Formatting

TOON requires decimal form without scientific notation:

| Value | JSON | TOON |
|-------|------|------|
| 15 billion | `1.5e10` | `15000000000` |
| Tiny | `1e-10` | `0.0000000001` |
| NaN | N/A | `null` |
| Infinity | N/A | `null` |
| -0 | `-0` | `0` |
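
The conversions in the table can be sketched for `Double` values as follows. `canonical` is a hypothetical helper name; it relies only on `java.math.BigDecimal` from the JDK and returns the text that would be written.

```scala
object ToonNumberSketch {
  // Format a Double the way the table above describes.
  def canonical(d: Double): String =
    if (d.isNaN || d.isInfinity) "null"
    else if (d == 0.0) "0" // true for both 0.0 and -0.0
    else java.math.BigDecimal.valueOf(d).stripTrailingZeros.toPlainString
}
```

`toPlainString` is what removes the exponent; `stripTrailingZeros` turns `42.0` into `42`.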

### 1.7 Key Folding (Optional)

Chains of single-key wrapper objects can be collapsed:

```toon
user.profile.settings.theme: dark
```

Equivalent to:
```toon
user:
  profile:
    settings:
      theme: dark
```
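
A minimal sketch of folding and its inverse, using a hypothetical `Value` model defined only for this example. It ignores the case where a key itself contains a dot, which a real implementation would have to rule out before folding.

```scala
object KeyFoldingSketch {
  // Hypothetical mini-model of TOON values, used only for this illustration.
  sealed trait Value
  final case class Prim(text: String) extends Value
  final case class Obj(fields: List[(String, Value)]) extends Value

  // Collapse a chain of single-key objects into one dotted key.
  def fold(key: String, value: Value): (String, Value) = value match {
    case Obj(List((childKey, child))) => fold(s"$key.$childKey", child)
    case other                        => (key, other)
  }

  // Inverse: expand a dotted key back into nested single-key objects.
  def expand(dottedKey: String, value: Value): (String, Value) = {
    val segments = dottedKey.split('.').toList
    val nested = segments.tail.foldRight(value)((segment, inner) => Obj(List(segment -> inner)))
    (segments.head, nested)
  }
}
```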

---

## Part 2: ZIO Schema 2 Architecture

### 2.1 Core Abstractions

ZIO Schema 2 uses a deriver-based architecture where format codecs are derived from `Schema[A]` definitions. The key components are:

```scala
// The schema definition
case class Person(name: String, age: Int)
object Person {
  implicit val schema: Schema[Person] = Schema.derived
}

// Deriving a codec
val jsonCodec: JsonBinaryCodec[Person] = Schema[Person].derive(JsonFormat.deriver)
```

### 2.2 Deriver Trait

The `Deriver[TC[_]]` trait defines how to derive type class instances for different schema shapes:

```scala
trait Deriver[TC[_]] {
  def derivePrimitive[F[_, _], A](
    primitiveType: PrimitiveType[A],
    typeName: TypeName[A],
    binding: Binding[BindingType.Primitive, A],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  ): Lazy[TC[A]]

  def deriveRecord[F[_, _], A](
    fields: IndexedSeq[Term[F, A, ?]],
    typeName: TypeName[A],
    binding: Binding[BindingType.Record, A],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[TC[A]]

  def deriveVariant[F[_, _], A](
    cases: IndexedSeq[Term[F, A, ?]],
    typeName: TypeName[A],
    binding: Binding[BindingType.Variant, A],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[TC[A]]

  def deriveSequence[F[_, _], C[_], A](
    element: Reflect[F, A],
    typeName: TypeName[C[A]],
    binding: Binding[BindingType.Seq[C], C[A]],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[TC[C[A]]]

  def deriveMap[F[_, _], M[_, _], K, V](
    key: Reflect[F, K],
    value: Reflect[F, V],
    typeName: TypeName[M[K, V]],
    binding: Binding[BindingType.Map[M], M[K, V]],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[TC[M[K, V]]]

  def deriveDynamic[F[_, _]](
    binding: Binding[BindingType.Dynamic, DynamicValue],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[TC[DynamicValue]]

  def deriveWrapper[F[_, _], A, B](
    wrapped: Reflect[F, B],
    typeName: TypeName[A],
    wrapperPrimitiveType: Option[PrimitiveType[A]],
    binding: Binding[BindingType.Wrapper[A, B], A],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[TC[A]]
}
```

### 2.3 BinaryCodec Pattern

Codecs extend `BinaryCodec[A]` and work with streaming readers/writers:

```scala
abstract class JsonBinaryCodec[A](val valueType: Int = JsonBinaryCodec.objectType) 
    extends BinaryCodec[A] {
  
  // Core methods to implement
  def decodeValue(in: JsonReader, default: A): A
  def encodeValue(x: A, out: JsonWriter): Unit
  
  // Optional key encoding (for map keys)
  def decodeKey(in: JsonReader): A
  def encodeKey(x: A, out: JsonWriter): Unit
  
  // Null value for initialization
  def nullValue: A = null.asInstanceOf[A]
  
  // Public API
  def decode(input: ByteBuffer, config: ReaderConfig): Either[SchemaError, A]
  def encode(value: A, output: ByteBuffer, config: WriterConfig): Unit
}
```

### 2.4 Configuration Architecture

Configuration is split between two concerns:

**Semantic configuration** lives on the deriver class itself:

```scala
class JsonBinaryCodecDeriver(
  fieldNameMapper: NameMapper,           // Field name transformation
  caseNameMapper: NameMapper,            // Case/variant name transformation  
  discriminatorKind: DiscriminatorKind,  // ADT encoding strategy
  rejectExtraFields: Boolean,            // Fail on unknown fields
  enumValuesAsStrings: Boolean,          // Enum encoding style
  transientNone: Boolean,                // Omit None values
  requireOptionFields: Boolean,          // Require Option fields
  transientEmptyCollection: Boolean,     // Omit empty collections
  requireCollectionFields: Boolean,      // Require collection fields
  transientDefaultValue: Boolean,        // Omit default-valued fields
  requireDefaultValueFields: Boolean     // Require fields with defaults
) extends Deriver[JsonBinaryCodec]
```

**Runtime configuration** lives in separate config classes:

```scala
// ReaderConfig: buffer sizes and parsing behavior
class ReaderConfig(
  val preferredBufSize: Int,      // Default: 32768
  val preferredCharBufSize: Int,  // Default: 4096
  val maxBufSize: Int,            // Default: 33554432
  val maxCharBufSize: Int,        // Default: 4194304
  val checkForEndOfInput: Boolean // Default: true
)

// WriterConfig: output formatting
class WriterConfig(
  val indentionStep: Int,     // Default: 0 (compact)
  val preferredBufSize: Int,  // Default: 32768
  val escapeUnicode: Boolean  // Default: false
)
```

### 2.5 DiscriminatorKind for ADTs

Sum types (sealed traits) support three encoding strategies:

```scala
sealed trait DiscriminatorKind

object DiscriminatorKind {
  // Wrapper object: {"Cat": {"name": "Whiskers"}}
  case object Key extends DiscriminatorKind  // DEFAULT
  
  // Embedded field: {"type": "Cat", "name": "Whiskers"}
  case class Field(name: String) extends DiscriminatorKind
  
  // No discriminator: try each case sequentially
  case object None extends DiscriminatorKind
}
```

### 2.6 NameMapper for Field Transformation

```scala
sealed trait NameMapper extends (String => String)

object NameMapper {
  case object Identity extends NameMapper   // No transformation (default)
  case object SnakeCase extends NameMapper  // memberName → member_name
  case object CamelCase extends NameMapper  // member_name → memberName
  case object PascalCase extends NameMapper // member_name → MemberName
  case object KebabCase extends NameMapper  // memberName → member-name
  case class Custom(f: String => String) extends NameMapper
}
```
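
To make the transformations concrete, here is a minimal sketch of functions that could be passed to `NameMapper.Custom`. The function names are hypothetical and the built-in mappers in zio-blocks may treat acronyms and digits differently.

```scala
object NameMapperSketch {
  // Insert `sep` before each upper-case letter (except at the start) and lower-case it.
  private def separate(name: String, sep: Char): String = {
    val sb = new StringBuilder
    name.foreach { c =>
      if (c.isUpper) {
        if (sb.nonEmpty) sb.append(sep)
        sb.append(c.toLower)
      } else sb.append(c)
    }
    sb.toString
  }

  val snakeCase: String => String = separate(_, '_') // memberName -> member_name
  val kebabCase: String => String = separate(_, '-') // memberName -> member-name

  // member_name -> MemberName
  val pascalCase: String => String =
    _.split('_').filter(_.nonEmpty).map(_.capitalize).mkString

  // member_name -> memberName
  val camelCase: String => String = { name =>
    val pascal = pascalCase(name)
    if (pascal.isEmpty) pascal else pascal.substring(0, 1).toLowerCase + pascal.substring(1)
  }
}
```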

### 2.7 Modifier System

ZIO Schema 2 uses its own `Modifier` classes (applied with Scala annotation syntax, not Java annotations) for customization:

```scala
// Rename a field or case
@Modifier.rename("new_name")
case class Example(field: String)

// Add decoding aliases
@Modifier.alias("old_name")
case object Blue extends Color

// Mark field as transient (excluded from serialization)
@Modifier.transient()
val internalField: Int = 0
```

Programmatic application:
```scala
val codec = Color.schema
  .deriving(JsonBinaryCodecDeriver)
  .modifier(Color.red, Modifier.rename("Rose"))
  .modifier(Color.red, Modifier.alias("Ruby"))
  .derive
```

---

## Part 3: TOON Implementation Design

### 3.1 Module Structure

```
zio-blocks/
└── schema-toon/
    └── src/main/scala/zio/blocks/schema/toon/
        ├── ToonFormat.scala           # Format definition object
        ├── ToonBinaryCodec.scala      # Abstract codec class
        ├── ToonBinaryCodecDeriver.scala # Deriver implementation
        ├── ToonReader.scala           # Streaming parser
        ├── ToonWriter.scala           # Streaming serializer
        ├── ToonReaderConfig.scala     # Parser configuration
        ├── ToonWriterConfig.scala     # Serializer configuration
        ├── ArrayFormat.scala          # TOON-specific array encoding
        └── DiscriminatorKind.scala    # Reuse or extend from JSON
```

### 3.2 ToonFormat Object

```scala
package zio.blocks.schema.toon

import zio.blocks.schema.codec.BinaryFormat

/**
 * The TOON format for ZIO Schema 2.
 * 
 * TOON (Token-Oriented Object Notation) is a compact serialization format
 * optimized for LLM token efficiency, achieving 30-60% reduction vs JSON.
 */
// "text/toon" is the provisional media type named in the TOON spec; verify before release.
object ToonFormat extends BinaryFormat("text/toon", ToonBinaryCodecDeriver)
```

### 3.3 ArrayFormat Enum

```scala
package zio.blocks.schema.toon

/**
 * Specifies how arrays should be encoded in TOON format.
 */
sealed trait ArrayFormat

object ArrayFormat {
  /**
   * Automatically select the most compact format based on array contents:
   * - Tabular for uniform object arrays with primitive fields
   * - Inline for primitive arrays
   * - List for heterogeneous or nested data
   */
  case object Auto extends ArrayFormat
  
  /**
   * Force tabular format: an `items[N]{field1,field2}:` header followed by one delimited row per element.
   * Falls back to List if array is not tabular-eligible.
   */
  case object Tabular extends ArrayFormat
  
  /**
   * Force inline format: `items[N]: val1,val2,val3`
   * Only valid for primitive arrays.
   */
  case object Inline extends ArrayFormat
  
  /**
   * Force list format with `- ` markers.
   */
  case object List extends ArrayFormat
}
```

### 3.4 ToonBinaryCodecDeriver

```scala
package zio.blocks.schema.toon

import zio.blocks.schema._
import zio.blocks.schema.binding._
import zio.blocks.schema.codec.BinaryFormat
import zio.blocks.schema.derive._
import zio.blocks.schema.json.{DiscriminatorKind, NameMapper}

/**
 * Default TOON deriver with standard settings.
 */
object ToonBinaryCodecDeriver extends ToonBinaryCodecDeriver(
  fieldNameMapper = NameMapper.Identity,
  caseNameMapper = NameMapper.Identity,
  discriminatorKind = DiscriminatorKind.Key,
  arrayFormat = ArrayFormat.Auto,
  delimiter = ',',
  rejectExtraFields = false,
  enumValuesAsStrings = true,
  transientNone = true,
  requireOptionFields = false,
  transientEmptyCollection = true,
  requireCollectionFields = false,
  transientDefaultValue = true,
  requireDefaultValueFields = false,
  enableKeyFolding = false
)

/**
 * Deriver for TOON binary codecs with configurable behavior.
 *
 * @param fieldNameMapper       Transform strategy for field names
 * @param caseNameMapper        Transform strategy for variant case names  
 * @param discriminatorKind     ADT encoding strategy (Key, Field, None)
 * @param arrayFormat           Array encoding preference (Auto, Tabular, Inline, List)
 * @param delimiter             Value separator in tabular/inline arrays (comma default)
 * @param rejectExtraFields     Fail decoding on unrecognized fields
 * @param enumValuesAsStrings   Encode case object enums as strings
 * @param transientNone         Omit None-valued Option fields
 * @param requireOptionFields   Require Option fields to be present
 * @param transientEmptyCollection  Omit empty collection fields
 * @param requireCollectionFields   Require collection fields to be present
 * @param transientDefaultValue     Omit fields matching their default value
 * @param requireDefaultValueFields Require fields with defaults to be present
 * @param enableKeyFolding      Collapse chains of single-key objects into dotted keys (§1.7)
 */
class ToonBinaryCodecDeriver private[toon] (
  fieldNameMapper: NameMapper,
  caseNameMapper: NameMapper,
  discriminatorKind: DiscriminatorKind,
  arrayFormat: ArrayFormat,
  delimiter: Char,
  rejectExtraFields: Boolean,
  enumValuesAsStrings: Boolean,
  transientNone: Boolean,
  requireOptionFields: Boolean,
  transientEmptyCollection: Boolean,
  requireCollectionFields: Boolean,
  transientDefaultValue: Boolean,
  requireDefaultValueFields: Boolean,
  enableKeyFolding: Boolean
) extends Deriver[ToonBinaryCodec] {

  // Builder methods
  def withFieldNameMapper(mapper: NameMapper): ToonBinaryCodecDeriver =
    copy(fieldNameMapper = mapper)
    
  def withCaseNameMapper(mapper: NameMapper): ToonBinaryCodecDeriver =
    copy(caseNameMapper = mapper)
    
  def withDiscriminatorKind(kind: DiscriminatorKind): ToonBinaryCodecDeriver =
    copy(discriminatorKind = kind)
    
  def withArrayFormat(format: ArrayFormat): ToonBinaryCodecDeriver =
    copy(arrayFormat = format)
    
  def withDelimiter(delim: Char): ToonBinaryCodecDeriver =
    copy(delimiter = delim)
    
  def withRejectExtraFields(reject: Boolean): ToonBinaryCodecDeriver =
    copy(rejectExtraFields = reject)
    
  def withEnumValuesAsStrings(asStrings: Boolean): ToonBinaryCodecDeriver =
    copy(enumValuesAsStrings = asStrings)
    
  def withTransientNone(transient: Boolean): ToonBinaryCodecDeriver =
    copy(transientNone = transient)
    
  def withKeyFolding(enabled: Boolean): ToonBinaryCodecDeriver =
    copy(enableKeyFolding = enabled)

  // ... additional builder methods ...

  private def copy(
    fieldNameMapper: NameMapper = fieldNameMapper,
    caseNameMapper: NameMapper = caseNameMapper,
    discriminatorKind: DiscriminatorKind = discriminatorKind,
    arrayFormat: ArrayFormat = arrayFormat,
    delimiter: Char = delimiter,
    rejectExtraFields: Boolean = rejectExtraFields,
    enumValuesAsStrings: Boolean = enumValuesAsStrings,
    transientNone: Boolean = transientNone,
    requireOptionFields: Boolean = requireOptionFields,
    transientEmptyCollection: Boolean = transientEmptyCollection,
    requireCollectionFields: Boolean = requireCollectionFields,
    transientDefaultValue: Boolean = transientDefaultValue,
    requireDefaultValueFields: Boolean = requireDefaultValueFields,
    enableKeyFolding: Boolean = enableKeyFolding
  ): ToonBinaryCodecDeriver = new ToonBinaryCodecDeriver(
    fieldNameMapper, caseNameMapper, discriminatorKind, arrayFormat,
    delimiter, rejectExtraFields, enumValuesAsStrings, transientNone,
    requireOptionFields, transientEmptyCollection, requireCollectionFields,
    transientDefaultValue, requireDefaultValueFields, enableKeyFolding
  )

  // Deriver implementation
  override def derivePrimitive[F[_, _], A](
    primitiveType: PrimitiveType[A],
    typeName: TypeName[A],
    binding: Binding[BindingType.Primitive, A],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  ): Lazy[ToonBinaryCodec[A]] = Lazy {
    // Implementation: return appropriate codec for primitive type
    ???
  }

  override def deriveRecord[F[_, _], A](
    fields: IndexedSeq[Term[F, A, ?]],
    typeName: TypeName[A],
    binding: Binding[BindingType.Record, A],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[ToonBinaryCodec[A]] = Lazy {
    // Implementation: derive codec for case class / record
    ???
  }

  override def deriveVariant[F[_, _], A](
    cases: IndexedSeq[Term[F, A, ?]],
    typeName: TypeName[A],
    binding: Binding[BindingType.Variant, A],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[ToonBinaryCodec[A]] = Lazy {
    // Implementation: derive codec for sealed trait / enum
    // Handle discriminatorKind, enumValuesAsStrings, caseNameMapper
    ???
  }

  override def deriveSequence[F[_, _], C[_], A](
    element: Reflect[F, A],
    typeName: TypeName[C[A]],
    binding: Binding[BindingType.Seq[C], C[A]],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[ToonBinaryCodec[C[A]]] = Lazy {
    // Implementation: derive codec for sequences
    // Key TOON logic: select array format based on arrayFormat setting
    // and element uniformity analysis
    ???
  }

  override def deriveMap[F[_, _], M[_, _], K, V](
    key: Reflect[F, K],
    value: Reflect[F, V],
    typeName: TypeName[M[K, V]],
    binding: Binding[BindingType.Map[M], M[K, V]],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[ToonBinaryCodec[M[K, V]]] = Lazy {
    // Implementation: derive codec for maps
    ???
  }

  override def deriveDynamic[F[_, _]](
    binding: Binding[BindingType.Dynamic, DynamicValue],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[ToonBinaryCodec[DynamicValue]] = Lazy {
    // Implementation: derive codec for dynamic values
    ???
  }

  override def deriveWrapper[F[_, _], A, B](
    wrapped: Reflect[F, B],
    typeName: TypeName[A],
    wrapperPrimitiveType: Option[PrimitiveType[A]],
    binding: Binding[BindingType.Wrapper[A, B], A],
    doc: Doc,
    modifiers: Seq[Modifier.Reflect]
  )(implicit F: HasBinding[F], D: HasInstance[F]): Lazy[ToonBinaryCodec[A]] = Lazy {
    // Implementation: derive codec for wrapper types (newtypes)
    ???
  }
}
```

### 3.5 ToonBinaryCodec

```scala
package zio.blocks.schema.toon

import zio.blocks.schema.SchemaError
import zio.blocks.schema.codec.BinaryCodec
import java.nio.ByteBuffer

/**
 * Abstract codec for TOON encoding/decoding.
 *
 * @param valueType Optimization hint for primitive types
 */
abstract class ToonBinaryCodec[A](val valueType: Int = ToonBinaryCodec.objectType) 
    extends BinaryCodec[A] {

  /**
   * Decode a value from a TOON reader.
   *
   * @param in      The TOON reader providing input
   * @param default Default value for initialization
   * @return The decoded value
   */
  def decodeValue(in: ToonReader, default: A): A

  /**
   * Encode a value to a TOON writer.
   *
   * @param x   The value to encode
   * @param out The TOON writer for output
   */
  def encodeValue(x: A, out: ToonWriter): Unit

  /**
   * Decode a value used as a map key.
   */
  def decodeKey(in: ToonReader): A = 
    in.decodeError("decoding as TOON key is not supported")

  /**
   * Encode a value as a map key.
   */
  def encodeKey(x: A, out: ToonWriter): Unit = 
    out.encodeError("encoding as TOON key is not supported")

  /**
   * The null/default value for this type.
   */
  def nullValue: A = null.asInstanceOf[A]

  // Public API
  override def decode(input: ByteBuffer): Either[SchemaError, A] = 
    decode(input, ToonReaderConfig)

  override def encode(value: A, output: ByteBuffer): Unit = 
    encode(value, output, ToonWriterConfig)

  def decode(input: ByteBuffer, config: ToonReaderConfig): Either[SchemaError, A]
  
  def encode(value: A, output: ByteBuffer, config: ToonWriterConfig): Unit

  // Convenience methods for byte arrays and strings
  def decodeFromString(input: String): Either[SchemaError, A]
  def encodeToString(value: A): String
}

object ToonBinaryCodec {
  val objectType  = 0
  val intType     = 1
  val longType    = 2
  val floatType   = 3
  val doubleType  = 4
  val booleanType = 5
  val byteType    = 6
  val charType    = 7
  val shortType   = 8
  val unitType    = 9
  
  // Predefined primitive codecs
  val unitCodec: ToonBinaryCodec[Unit] = ???
  val booleanCodec: ToonBinaryCodec[Boolean] = ???
  val byteCodec: ToonBinaryCodec[Byte] = ???
  val shortCodec: ToonBinaryCodec[Short] = ???
  val intCodec: ToonBinaryCodec[Int] = ???
  val longCodec: ToonBinaryCodec[Long] = ???
  val floatCodec: ToonBinaryCodec[Float] = ???
  val doubleCodec: ToonBinaryCodec[Double] = ???
  val charCodec: ToonBinaryCodec[Char] = ???
  val stringCodec: ToonBinaryCodec[String] = ???
  val bigIntCodec: ToonBinaryCodec[BigInt] = ???
  val bigDecimalCodec: ToonBinaryCodec[BigDecimal] = ???
  // ... java.time codecs, UUID, Currency, etc.
}
```

### 3.6 Configuration Classes

```scala
package zio.blocks.schema.toon

/**
 * Configuration for ToonReader.
 *
 * @param preferredBufSize     Preferred byte buffer size
 * @param preferredCharBufSize Preferred char buffer size  
 * @param maxBufSize           Maximum byte buffer size
 * @param maxCharBufSize       Maximum char buffer size
 * @param checkForEndOfInput   Verify no trailing content after parsing
 * @param strictArrayLength    Validate array length markers match actual count
 */
class ToonReaderConfig private (
  val preferredBufSize: Int,
  val preferredCharBufSize: Int,
  val maxBufSize: Int,
  val maxCharBufSize: Int,
  val checkForEndOfInput: Boolean,
  val strictArrayLength: Boolean
) extends Serializable {
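  // A plain class has no generated `copy`, so each builder calls the constructor directly.
  def withStrictArrayLength(strict: Boolean): ToonReaderConfig =
    new ToonReaderConfig(
      preferredBufSize, preferredCharBufSize, maxBufSize, maxCharBufSize,
      checkForEndOfInput, strict
    )
  // ... other builder methods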
}

object ToonReaderConfig extends ToonReaderConfig(
  preferredBufSize = 32768,
  preferredCharBufSize = 4096,
  maxBufSize = 33554432,
  maxCharBufSize = 4194304,
  checkForEndOfInput = true,
  strictArrayLength = true
)

/**
 * Configuration for ToonWriter.
 *
 * @param indentSize       Spaces per indentation level (default: 2)
 * @param preferredBufSize Preferred output buffer size
 * @param lineEnding       Line ending style (the spec requires LF)
 */
class ToonWriterConfig private (
  val indentSize: Int,
  val preferredBufSize: Int,
  val lineEnding: String
) extends Serializable {
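  // A plain class has no generated `copy`, so each builder calls the constructor directly.
  def withIndentSize(size: Int): ToonWriterConfig =
    new ToonWriterConfig(size, preferredBufSize, lineEnding)
  // ... other builder methods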
}

object ToonWriterConfig extends ToonWriterConfig(
  indentSize = 2,
  preferredBufSize = 32768,
  lineEnding = "\n"
)
```

---

## Part 4: Encoding Rules and Algorithms

### 4.1 Array Format Selection Algorithm

When `ArrayFormat.Auto` is configured, the encoder must analyze array contents:

```scala
def selectArrayFormat[A](elements: Iterable[A], elementCodec: ToonBinaryCodec[A]): ArrayFormat = {
  if (elements.isEmpty) {
    ArrayFormat.Inline  // Empty arrays: items[0]:
  } else if (isPrimitiveCodec(elementCodec)) {
    ArrayFormat.Inline  // Primitive arrays: items[3]: a,b,c
  } else if (isUniformObjectArray(elements)) {
    ArrayFormat.Tabular // Uniform objects: items[N]{fields}: rows...
  } else {
    ArrayFormat.List    // Everything else: - item format
  }
}

def isUniformObjectArray[A](elements: Iterable[A]): Boolean = {
  // Check that:
  // 1. All elements are objects (case classes)
  // 2. All have identical field names in same order
  // 3. All field values are primitives (not nested objects/arrays)
  ???
}
```
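
Because the element type is known from the schema, much of this decision could be made once at derivation time instead of per array. The sketch below assumes a hypothetical `Shape` type that stands in for whatever the deriver learns from `Reflect`; it is not a zio-blocks type. It also ignores options such as `transientNone`, which can make rows non-uniform at runtime and would force a per-array check.

```scala
object ArrayFormatSketch {
  sealed trait ArrayFormat
  object ArrayFormat {
    case object Tabular extends ArrayFormat
    case object Inline  extends ArrayFormat
    case object List    extends ArrayFormat
  }

  // Hypothetical summary of the element schema, invented for this example.
  sealed trait Shape
  case object PrimitiveShape extends Shape
  final case class RecordShape(fieldShapes: List[Shape]) extends Shape
  case object OtherShape extends Shape // variants, sequences, maps, dynamic values

  // Choose the array format from the element shape alone.
  def select(element: Shape): ArrayFormat = element match {
    case PrimitiveShape => ArrayFormat.Inline
    case RecordShape(fields) if fields.nonEmpty && fields.forall(_ == PrimitiveShape) =>
      ArrayFormat.Tabular
    case _ => ArrayFormat.List
  }
}
```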

### 4.2 String Encoding Rules

```scala
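def requiresQuoting(s: String, delimiter: Char): Boolean = {
  s.isEmpty ||
  s.charAt(0).isWhitespace ||
  s.charAt(s.length - 1).isWhitespace ||
  s.charAt(0) == '-' ||                          // Would read as a list marker
  s == "true" || s == "false" || s == "null" ||  // Would decode as a keyword
  looksLikeNumber(s) ||                          // Would decode as a number (helper not shown)
  s.indexOf(delimiter) >= 0 ||
  s.indexOf(':') >= 0 ||
  s.indexOf('"') >= 0 ||
  s.indexOf('\\') >= 0 ||
  s.indexOf('{') >= 0 ||
  s.indexOf('}') >= 0 ||
  s.indexOf('[') >= 0 ||
  s.indexOf(']') >= 0 ||
  containsControlCharacters(s)
}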

def encodeString(s: String, delimiter: Char, out: ToonWriter): Unit = {
  if (requiresQuoting(s, delimiter)) {
    out.writeQuotedString(s)  // Escape \, ", \n, \r, \t
  } else {
    out.writeRawString(s)
  }
}
```

### 4.3 Number Encoding Rules

```scala
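def encodeNumber(n: Double, out: ToonWriter): Unit = {
  // Only floating-point types can hold NaN, infinities, and -0.0
  if (n.isNaN || n.isInfinity) {
    out.writeNull()
  } else if (n == 0.0) {
    out.writeRaw("0")  // Normalize -0 to 0 (-0.0 == 0.0 is true)
  } else {
    // Convert to non-exponential decimal form without trailing zeros
    out.writeRaw(java.math.BigDecimal.valueOf(n).stripTrailingZeros.toPlainString)
  }
}

// BigDecimal has no NaN, infinity, or negative zero, so only the plain-string step applies
def encodeNumber(n: BigDecimal, out: ToonWriter): Unit =
  out.writeRaw(n.bigDecimal.toPlainString)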

### 4.4 ADT Encoding with Discriminators

**DiscriminatorKind.Key (default):**
```toon
Cat:
  name: Whiskers
  lives: 9
```

**DiscriminatorKind.Field("type"):**
```toon
type: Cat
name: Whiskers
lives: 9
```

**DiscriminatorKind.None:**
```toon
name: Whiskers
lives: 9
```
(Decoder tries each case sequentially)
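
The three layouts can be reproduced with a minimal sketch. `DiscriminatorKind` is redefined locally so the example is self-contained, and `renderCase` is a hypothetical helper that takes already-formatted field values.

```scala
object DiscriminatorSketch {
  sealed trait DiscriminatorKind
  object DiscriminatorKind {
    case object Key extends DiscriminatorKind
    final case class Field(name: String) extends DiscriminatorKind
    case object None extends DiscriminatorKind
  }

  // Render one variant case whose payload is a flat list of (name, formatted value) pairs.
  def renderCase(
    caseName: String,
    fields: List[(String, String)],
    kind: DiscriminatorKind
  ): List[String] = {
    val body = fields.map { case (k, v) => s"$k: $v" }
    kind match {
      case DiscriminatorKind.Key         => s"$caseName:" :: body.map("  " + _)
      case DiscriminatorKind.Field(name) => s"$name: $caseName" :: body
      case DiscriminatorKind.None        => body
    }
  }
}
```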

### 4.5 Tabular Array Encoding

For uniform object arrays:

```scala
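def encodeTabularArray[A](
  fieldName: String,
  elements: IndexedSeq[A],
  fieldNames: IndexedSeq[String],
  fieldCodecs: IndexedSeq[ToonBinaryCodec[Any]],
  getField: (A, Int) => Any,  // Field accessor; in the real deriver this would come from the record binding
  delimiter: Char,
  out: ToonWriter
): Unit = {
  val sep = delimiter.toString
  // Header: fieldName[count]{field1,field2,...}:
  // Assumption from the TOON spec: a non-comma delimiter is declared inside the brackets, e.g. [3|]
  out.writeRaw(fieldName)
  out.writeRaw("[")
  out.writeRaw(elements.length.toString)
  if (delimiter != ',') out.writeRaw(sep)
  out.writeRaw("]{")
  out.writeRaw(fieldNames.mkString(sep))
  out.writeRaw("}:")
  out.newLine()

  // Rows: value1,value2,... (string codecs must quote values that contain the delimiter)
  elements.foreach { element =>
    out.writeIndent()
    fieldCodecs.zipWithIndex.foreach { case (codec, idx) =>
      if (idx > 0) out.writeRaw(sep)
      codec.encodeValue(getField(element, idx), out)
    }
    out.newLine()
  }
}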
```
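
The decoding side needs to parse the header and, when `strictArrayLength` is enabled, compare the declared count with the rows actually present. The following is a minimal sketch under several assumptions: the names `Header`, `parseHeader`, and `decodeRows` are hypothetical, only the comma delimiter is supported, keys are limited to plain identifiers, and quoted cells that contain a comma are not handled.

```scala
object TabularHeaderSketch {
  final case class Header(key: String, length: Int, fields: List[String])

  // Matches headers such as users[3]{id,name,email}:
  private val HeaderPattern =
    """^([A-Za-z_][A-Za-z0-9_.]*)\[(\d{1,9})\]\{([^}]*)\}:$""".r

  def parseHeader(line: String): Option[Header] = line match {
    case HeaderPattern(key, n, fields) => Some(Header(key, n.toInt, fields.split(',').toList))
    case _                             => None
  }

  // Turn each row into a field-name -> raw-text map, validating counts on the way.
  def decodeRows(
    header: Header,
    rows: List[String],
    strictArrayLength: Boolean
  ): Either[String, List[Map[String, String]]] =
    if (strictArrayLength && rows.length != header.length)
      Left(s"expected ${header.length} rows but found ${rows.length}")
    else {
      val decoded = rows.map { row =>
        val cells = row.trim.split(",", -1).toList
        if (cells.length == header.fields.length) Right(header.fields.zip(cells).toMap)
        else Left(s"expected ${header.fields.length} values but found ${cells.length}: $row")
      }
      decoded.collectFirst { case Left(e) => e } match {
        case Some(e) => Left(e)
        case None    => Right(decoded.collect { case Right(m) => m })
      }
    }
}
```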

---

## Part 5: Acceptance Criteria

### 5.1 Functional Requirements

#### Primitive Types
- [ ] All primitive types encode/decode correctly: Unit, Boolean, Byte, Short, Int, Long, Float, Double, Char, String, BigInt, BigDecimal
- [ ] All java.time types: Instant, LocalDate, LocalTime, LocalDateTime, OffsetDateTime, ZonedDateTime, Duration, Period, Year, YearMonth, MonthDay, Month, DayOfWeek, ZoneId, ZoneOffset
- [ ] UUID and Currency types
- [ ] Numbers use decimal form (no scientific notation)
- [ ] NaN and Infinity encode as `null`
- [ ] -0 normalizes to 0

#### Strings
- [ ] Unquoted strings work for simple values
- [ ] Quoted strings handle delimiters, colons, whitespace, control characters
- [ ] Only valid escape sequences: `\\`, `\"`, `\n`, `\r`, `\t`
- [ ] UTF-8 encoding with LF line endings

#### Arrays
- [ ] ArrayFormat.Auto selects optimal format
- [ ] Tabular format for uniform object arrays
- [ ] Inline format for primitive arrays
- [ ] List format for heterogeneous data
- [ ] Array length markers `[N]` are accurate
- [ ] Empty arrays encode correctly: `items[0]:`
- [ ] Custom delimiter support (comma, tab, pipe)

#### Objects/Records
- [ ] Indentation-based nesting works correctly
- [ ] Field name transformation via NameMapper
- [ ] Transient field handling (None, empty collections, defaults)
- [ ] Required field validation
- [ ] Extra field rejection (configurable)
- [ ] Modifier.rename and Modifier.alias support

#### ADTs/Variants
- [ ] DiscriminatorKind.Key (wrapper object) works
- [ ] DiscriminatorKind.Field embeds discriminator
- [ ] DiscriminatorKind.None tries cases sequentially
- [ ] Case name transformation via NameMapper
- [ ] enumValuesAsStrings for case object enums
- [ ] Nested ADTs work correctly
- [ ] Modifier.rename and Modifier.alias on cases

#### Maps
- [ ] String-keyed maps encode as objects
- [ ] Non-string-keyed maps use array of pairs or error

#### Wrappers/Newtypes
- [ ] Wrapper types encode as their underlying type
- [ ] Validation on decode (partial wrappers)

#### DynamicValue
- [ ] Full DynamicValue support for schema-less data

### 5.2 Non-Functional Requirements

#### Performance
- [ ] Zero-allocation encoding for primitives (use value types)
- [ ] Streaming encode/decode (no full materialization)
- [ ] Buffer reuse via thread-local pools
- [ ] Comparable performance to JSON codec

#### Compatibility
- [ ] Cross-platform: JVM, Scala.js, Scala Native
- [ ] Scala 2.13 and Scala 3 support
- [ ] No runtime reflection

#### Specification Compliance
- [ ] UTF-8 output with LF line endings
- [ ] Consistent indentation (configurable, default 2 spaces)
- [ ] No trailing whitespace
- [ ] No trailing newline
- [ ] Accurate array length markers
- [ ] Preserve object key order
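
Several of these rules are purely textual and can be asserted on any encoder output in tests. The helper below is a hypothetical test utility (`ToonOutputChecks.violations` is not part of any existing API) that covers line endings, trailing newline, trailing whitespace, and indentation width; it does not check array length markers or key order.

```scala
object ToonOutputChecks {
  // Returns a description of every textual rule that `doc` breaks.
  def violations(doc: String, indent: Int = 2): List[String] = {
    // limit = -1 keeps a trailing empty segment, so "a\n" yields two lines
    val lines = doc.split("\n", -1).toList
    val checks = List(
      (doc.contains("\r"), "carriage return found; only LF line endings are allowed"),
      (doc.endsWith("\n"), "trailing newline"),
      (lines.exists(l => l.endsWith(" ") || l.endsWith("\t")), "trailing whitespace"),
      (lines.exists(l => l.takeWhile(_ == ' ').length % indent != 0),
        "indentation is not a multiple of the configured width")
    )
    checks.collect { case (true, message) => message }
  }
}
```

A raw carriage return should never appear in valid output because `\r` inside a string is emitted as a two-character escape.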

### 5.3 Test Coverage

#### Unit Tests
- [ ] All primitive codecs round-trip correctly
- [ ] All array formats encode/decode correctly
- [ ] All discriminator kinds work
- [ ] All NameMapper variants work
- [ ] Error messages include path information
- [ ] Edge cases: empty strings, empty arrays, empty objects, deeply nested structures

#### Property-Based Tests
- [ ] Arbitrary case classes round-trip
- [ ] Arbitrary sealed traits round-trip
- [ ] JSON↔TOON conversion is lossless

#### Integration Tests
- [ ] Large documents (>1MB)
- [ ] Deeply nested structures (>100 levels)
- [ ] Wide objects (>100 fields)
- [ ] Unicode content

### 5.4 Documentation

- [ ] Scaladoc on all public APIs
- [ ] Usage examples in tests
- [ ] README with quick start guide
- [ ] Configuration reference

---

## Part 6: Reference Implementation Notes

### 6.1 Existing TOON Libraries

**toon4s** (github.com/vim89/toon4s) provides a Scala TOON implementation with:
- Sealed ADT for TOON values: `ToonValue = TNull | TBool | TNumber | TString | TArray | TObj`
- JSON↔TOON bidirectional conversion
- Does NOT provide automatic derivation for case classes

**TypeScript SDK** (github.com/toon-format/toon) is the reference implementation with:
- Complete parser and serializer
- Schema-aware encoding
- Comprehensive test suite

### 6.2 JSON Codec Reference

The `JsonBinaryCodecDeriver` in zio-blocks serves as the primary reference for implementation patterns:
- Thread-local caching for recursive types
- Field info classes for optimized encoding
- String map for O(1) field lookup during decoding
- Specialized codecs for primitive arrays

### 6.3 Test Data

The TOON specification repository includes a test suite at `github.com/toon-format/spec/tree/main/tests` with:
- Valid TOON documents
- Invalid TOON documents with expected errors
- JSON↔TOON conversion pairs

---

## Appendix A: Example Encodings

### Simple Record
```scala
case class Person(name: String, age: Int)
val person = Person("Alice", 30)
```

**TOON:**
```toon
name: Alice
age: 30
```

### Nested Record
```scala
case class Address(street: String, city: String)
case class Person(name: String, address: Address)
val person = Person("Alice", Address("123 Main", "Springfield"))
```

**TOON:**
```toon
name: Alice
address:
  street: 123 Main
  city: Springfield
```
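
The indentation rule behind this output can be sketched as a small recursive function over a toy value model. `Value`, `Str`, and `Obj` are stand-ins invented for this example; the real codec would be driven by the derived schema rather than an intermediate tree.

```scala
object ToonNestingSketch {
  sealed trait Value
  final case class Str(s: String) extends Value // assumed already quoted if needed
  final case class Obj(fields: List[(String, Value)]) extends Value

  // Each nesting level adds `indent` spaces; field order is preserved.
  def encode(v: Obj, indent: Int = 2, depth: Int = 0): String = {
    val pad = " " * (indent * depth)
    v.fields.map {
      case (k, Str(s))                     => pad + k + ": " + s
      case (k, o: Obj) if o.fields.isEmpty => pad + k + ":"
      case (k, o: Obj)                     => pad + k + ":\n" + encode(o, indent, depth + 1)
    }.mkString("\n") // mkString avoids a trailing newline
  }
}
```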

### Uniform Array (Tabular)
```scala
case class User(id: Int, name: String)
val users = List(User(1, "Alice"), User(2, "Bob"))
```

**TOON:**
```toon
[2]{id,name}:
  1,Alice
  2,Bob
```
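
A minimal sketch of how such a root-level table could be rendered, assuming the comma delimiter and cells that are already encoded as TOON primitives. `ToonTabularSketch` is a hypothetical name; a derived codec would obtain the field names from the record schema instead of taking them as an argument.

```scala
object ToonTabularSketch {
  // Renders "[N]{f1,f2}:" followed by one indented, comma-separated row per element.
  def encode(fields: Seq[String], rows: Seq[Seq[String]], indent: Int = 2): String = {
    require(rows.forall(_.length == fields.length), "every row needs one cell per field")
    val header = "[" + rows.length + "]{" + fields.mkString(",") + "}:"
    val pad    = " " * indent
    (header +: rows.map(r => pad + r.mkString(","))).mkString("\n")
  }
}
```

The `require` reflects the precondition for the tabular format: every element must have the same fields, each holding a primitive value.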

### Sealed Trait (Key Discriminator)
```scala
sealed trait Pet
case class Cat(name: String, lives: Int) extends Pet
case class Dog(name: String, breed: String) extends Pet

val pet: Pet = Cat("Whiskers", 9)
```

**TOON:**
```toon
Cat:
  name: Whiskers
  lives: 9
```

### Sealed Trait (Field Discriminator)
```scala
// With: .withDiscriminatorKind(DiscriminatorKind.Field("type"))
```

**TOON:**
```toon
type: Cat
name: Whiskers
lives: 9
```
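
The difference between the two discriminator layouts can be sketched as follows. `Kind`, `Key`, and `Field` here are local stand-ins that only mirror the names used in this guide; they are not the actual `DiscriminatorKind` API, and field values are assumed to be pre-encoded primitives.

```scala
object ToonDiscriminatorSketch {
  sealed trait Kind
  case object Key extends Kind                      // case name wraps the fields
  final case class Field(name: String) extends Kind // case name becomes a leading field

  def encodeCase(caseName: String, fields: Seq[(String, String)], kind: Kind, indent: Int = 2): String =
    kind match {
      case Key =>
        val pad = " " * indent
        ((caseName + ":") +: fields.map { case (k, v) => pad + k + ": " + v }).mkString("\n")
      case Field(name) =>
        ((name -> caseName) +: fields).map { case (k, v) => k + ": " + v }.mkString("\n")
    }
}
```

With `Field`, the discriminator is emitted first so a decoder can select the case before reading the remaining fields; a real implementation also has to reject or rename a record field that collides with the discriminator name.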

### Case Object Enum
```scala
sealed trait Color
case object Red extends Color
case object Green extends Color
case object Blue extends Color

val color: Color = Green
```

**TOON (enumValuesAsStrings = true, default):**
```toon
Green
```

**TOON (enumValuesAsStrings = false):**
```toon
Green:
```

### Option Types
```scala
case class Config(name: String, timeout: Option[Int])
val config = Config("app", Some(30))
```

**TOON (Some value, transientNone = true, default):**
```toon
name: app
timeout: 30
```

**TOON (None value, transientNone = true):**
```toon
name: app
```
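
The effect of `transientNone` on a single optional field can be sketched like this. The hand-written `encode` stands in for what a derived codec would do, and emitting `timeout: null` when `transientNone = false` is an assumption of this sketch rather than something stated above.

```scala
object TransientNoneSketch {
  final case class Config(name: String, timeout: Option[Int])

  def encode(c: Config, transientNone: Boolean = true): String = {
    val timeoutLine: Option[String] = c.timeout match {
      case Some(t) => Some("timeout: " + t)
      case None    => if (transientNone) None else Some("timeout: null") // assumed behavior
    }
    (List("name: " + c.name) ++ timeoutLine.toList).mkString("\n")
  }
}
```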

---

## Appendix B: Error Messages

Error messages should follow the JSON codec pattern with path information:

```
illegal number with leading zero at: .users[2].age
missing required field "name" at: .config
illegal discriminator at: .event
expected '}' or ',' at: .response.data
unexpected field "extra" at: .request  (when rejectExtraFields = true)
array length mismatch: expected 3, got 2 at: .items  (when strictArrayLength = true)
```
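
One way to produce the `at: .users[2].age` suffix is to track a list of path segments while decoding and render it only when an error is raised. The types below are hypothetical and exist only for this illustration; the JSON codec in zio-blocks may represent paths differently.

```scala
object ToonErrorPathSketch {
  sealed trait Segment
  final case class FieldSeg(name: String) extends Segment
  final case class IndexSeg(index: Int) extends Segment

  // Fields render as ".name", array positions as "[i]".
  def render(path: List[Segment]): String =
    path.map {
      case FieldSeg(n) => "." + n
      case IndexSeg(i) => "[" + i + "]"
    }.mkString

  def message(problem: String, path: List[Segment]): String =
    problem + " at: " + render(path)
}
```

Deferring the rendering to the failure path keeps the success path free of string building.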

---

## Appendix C: Configuration Quick Reference

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `fieldNameMapper` | `NameMapper` | `Identity` | Field name transformation |
| `caseNameMapper` | `NameMapper` | `Identity` | Case name transformation |
| `discriminatorKind` | `DiscriminatorKind` | `Key` | ADT encoding strategy |
| `arrayFormat` | `ArrayFormat` | `Auto` | Array encoding preference |
| `delimiter` | `Char` | `,` | Array value separator |
| `rejectExtraFields` | `Boolean` | `false` | Fail on unknown fields |
| `enumValuesAsStrings` | `Boolean` | `true` | Case objects as strings |
| `transientNone` | `Boolean` | `true` | Omit None values |
| `requireOptionFields` | `Boolean` | `false` | Require Option fields |
| `transientEmptyCollection` | `Boolean` | `true` | Omit empty collections |
| `requireCollectionFields` | `Boolean` | `false` | Require collections |
| `transientDefaultValue` | `Boolean` | `true` | Omit default values |
| `requireDefaultValueFields` | `Boolean` | `false` | Require default fields |
| `enableKeyFolding` | `Boolean` | `false` | Fold single-key nested objects into dotted keys |

