Commit

math
QianC95 committed Oct 21, 2024
1 parent c0f87d6 commit 01df71f
Showing 3 changed files with 24 additions and 2 deletions.
9 changes: 9 additions & 0 deletions source/3D Location Encoder/NeRF.md
@@ -36,6 +36,15 @@ The `NERFSpatialRelationLocationEncoder` is designed to compute spatial embeddin
<img src="../images/NeRF.png" alt="NeRF-transformation" title="NeRF-transformation" width="60%" />
</p>

\( \text{Encoded}(x) = \bigoplus_{i=0}^{L-1} [\sin(2^i \pi x), \cos(2^i \pi x)] \)

\( \text{Encoded}(y) = \bigoplus_{i=0}^{L-1} [\sin(2^i \pi y), \cos(2^i \pi y)] \)

\( \text{Encoded}(z) = \bigoplus_{i=0}^{L-1} [\sin(2^i \pi z), \cos(2^i \pi z)] \)

Where ⊕ denotes concatenation of vectors and \( L \) is the number of frequencies.
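
The per-coordinate encoding above can be sketched in plain Python. This is a minimal illustration, not the `NERFSpatialRelationLocationEncoder` implementation: the function names `nerf_encode` and `nerf_encode_point` are hypothetical, and concatenating the encoded coordinates in x, y, z order is an assumption.

```python
import math

def nerf_encode(value, frequency_num):
    """Encode one coordinate as [sin(2^i * pi * v), cos(2^i * pi * v)] for i = 0..L-1."""
    encoded = []
    for i in range(frequency_num):
        angle = (2 ** i) * math.pi * value
        encoded.extend([math.sin(angle), math.cos(angle)])
    return encoded

def nerf_encode_point(coords, frequency_num):
    """Encode each coordinate separately, then concatenate (assumed ordering)."""
    out = []
    for c in coords:
        out.extend(nerf_encode(c, frequency_num))
    return out

embedding = nerf_encode_point((0.5, 0.25, -1.0), frequency_num=4)
print(len(embedding))  # 3 coordinates * 4 frequencies * 2 functions = 24
```

Each coordinate contributes `2 * frequency_num` values, so a point with `coord_dim` coordinates yields a vector of length `coord_dim * 2 * frequency_num` under these assumptions.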


### Configuration Parameters
- **coord_dim**: Dimensionality of the space being encoded (e.g., 2D, 3D).
- **frequency_num**: Number of different sinusoidal frequencies used to encode spatial differences.
13 changes: 11 additions & 2 deletions source/3D Location Encoder/xyz.md
@@ -46,9 +46,18 @@ Processes a batch of coordinates and converts them into spatial relation embeddi
- **Formulas:**
- Convert latitude `lat` and longitude `lon` coordinates into radians.
- Calculate `x, y, z` coordinates using the following equations:
<p align="center">
<img src="https://drive.google.com/uc?id=1tviSk-NbxB0G8fTx5vYpkvBJXmiBK08g" alt="xyz-transformation" title="xyz-transformation" width="80%" />

$$x = \cos(\mathrm{lat}) \times \cos(\mathrm{lon})$$
$$y = \cos(\mathrm{lat}) \times \sin(\mathrm{lon})$$
$$z = \sin(\mathrm{lat})$$
</p>

Where:

- *lat* is the latitude coordinate in radians.
- *lon* is the longitude coordinate in radians.
- *x*, *y*, *z* are the resulting Cartesian coordinates.

- Concatenate the `x, y, z` coordinates into a single 3D vector representation.

- **Returns:**
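
The latitude/longitude conversion listed under **Formulas** in `xyz.md` can be sketched as follows. This is an illustrative helper, not the encoder's own code: the name `latlon_to_xyz` is hypothetical, and taking the inputs in degrees is an assumption.

```python
import math

def latlon_to_xyz(lat_deg, lon_deg):
    """Map latitude/longitude (assumed to be in degrees) to a point on the unit sphere."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    return (x, y, z)

# The North Pole maps to approximately (0, 0, 1).
print(latlon_to_xyz(90.0, 0.0))
```

Because the formulas place every point on the unit sphere, the resulting vector always has length 1, which makes a quick sanity check for any implementation.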
4 changes: 4 additions & 0 deletions source/Basic Concepts/Single point location encoder.md
@@ -9,6 +9,10 @@ output:
<img src="../images/single_location_encoder_structure.png" alt="Location Encoder Structure" title="General Structure of Single Location Encoder" width="30%" />
</p>

$$
Enc(\mathbf{x}) = \mathbf{NN}(PE(\mathbf{x}))
$$


## EncoderMultiLayerFeedForwardNN()
`NN(⋅) : ℝ^W -> ℝ^d` is a learnable neural network component which maps the input position embedding `PE(x) ∈ ℝ^W` into the location embedding `Enc(x) ∈ ℝ^d`. A common practice is to define `NN(⋅)` as a multi-layer perceptron, while Mac Aodha et al. (2019) adopted a more complex `NN(⋅)` which includes an initial fully connected layer, followed by a series of residual blocks. The purpose of `NN(⋅)` is to provide a learnable component for the location encoder, which captures the complex interaction between input locations and target labels.
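
The composition `Enc(x) = NN(PE(x))` can be illustrated with a small pure-Python sketch. Everything here is an assumption made for illustration: the sinusoidal `position_encoding`, the two-layer perceptron, the layer sizes, and the random (untrained) weights are stand-ins, and none of it reflects the actual `EncoderMultiLayerFeedForwardNN()` code.

```python
import math
import random

def position_encoding(x, frequency_num=4):
    """Toy PE(x): sinusoidal features, so W = len(x) * 2 * frequency_num."""
    feats = []
    for c in x:
        for i in range(frequency_num):
            angle = (2 ** i) * math.pi * c
            feats.extend([math.sin(angle), math.cos(angle)])
    return feats

def make_layer(in_dim, out_dim, rng):
    """Random weights and zero biases; a real encoder would learn these."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim

def linear(layer, vec):
    """Apply one fully connected layer: weights @ vec + biases."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, biases)]

def relu(vec):
    return [max(0.0, v) for v in vec]

def encode(x, layers):
    """Enc(x) = NN(PE(x)), with NN as an MLP using ReLU between layers."""
    h = position_encoding(x)
    for layer in layers[:-1]:
        h = relu(linear(layer, h))
    return linear(layers[-1], h)

rng = random.Random(0)
W, hidden, d = 16, 32, 8  # W = 2 coordinates * 2 * 4 frequencies
layers = [make_layer(W, hidden, rng), make_layer(hidden, d, rng)]
location_embedding = encode((0.3, -0.7), layers)
print(len(location_embedding))  # 8
```

The sketch only shows the shape of the mapping from `ℝ^W` to `ℝ^d`; the residual-block variant described above would replace the plain hidden layer with blocks whose input is added back to their output.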