Replies: 7 comments
-
@pwalsh 👍 here. My only thought is the relation of this to the logical vs physical model. This is really part of the logical model, not the physical model. If we did have a "distinct" model object like in FDP, then it might make sense for it to go on that rather than on tableschema.
-
I added a short explanation re physical and logical.
-
@pwalsh 👍 on this. Think it is valuable, and people can just ignore it, which gives graceful degradation for non-supporting systems.
-
What should the behaviour be when exporting a datapackage that has a virtual column to, for example, an XLS file? Should these columns be added to the exported file? Similarly, when iterating over a datapackage's resource with a virtual column, should the library add these columns to the returned rows? Are these questions something we want to define in the spec at all?
-
@vitorbaptista I don't think we want to have that in the spec, but perhaps as implementation recommendations. What do you think?
-
To add a real-world example, I have a large collection of tide gauge data with constants stored in a readme. I agree with Vitor's questions. I think implementation guidance is needed, and also publisher guidance, e.g. when should you add columns to the data vs. add constants to the schema.
-
We ended up taking a different approach in the Fiscal Data Package (this is the working draft: https://hackmd.io/BwNgpgrCDGDsBMBaAhtALARkWsPEE5posR8RxgAzffWfDIA=?view), in which we don't modify the table schema itself but rather use another property in the descriptor for that.
-
Context
Data sources don’t always contain all the data necessary to use them properly and simply.
For example, some yearly budget files in specific countries won’t have a column with the fiscal year or the country code. Publishers of such files assume that anyone downloading a file would know which year and country the dataset belongs to. However, when adding such datasets to global repositories of fiscal data, the omission of this data from the actual rows becomes evident.
This problem is resolved by introducing the `constant` property for table schema fields. If this property exists and contains a value, then the field is ignored when reading the data from the data source, and it is added afterward with the constant value, the correct data type, etc.
Example:
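A minimal sketch of how a reader might treat such fields, assuming the `constant` property behaves as described above. The field names (`amount`, `fiscalYear`, `countryCode`), the sample values, and the `read_rows` helper are hypothetical and only for illustration; no existing library API is implied. Values read from the source are left as strings here, since casting is out of scope for this sketch.

```python
import csv
import io

# Hypothetical schema: "constant" is the property proposed in this discussion.
schema = {
    "fields": [
        {"name": "amount", "type": "number"},
        {"name": "fiscalYear", "type": "integer", "constant": 2015},
        {"name": "countryCode", "type": "string", "constant": "IL"},
    ]
}


def read_rows(source, schema):
    """Yield rows as dicts; constant fields are skipped on read and added afterward."""
    physical = [f for f in schema["fields"] if "constant" not in f]
    virtual = [f for f in schema["fields"] if "constant" in f]
    reader = csv.reader(source)
    next(reader)  # skip the header row
    for values in reader:
        # Only fields without a constant are mapped to columns in the source.
        row = {f["name"]: v for f, v in zip(physical, values)}
        # Virtual columns are appended with their constant value.
        for f in virtual:
            row[f["name"]] = f["constant"]
        yield row


# The source file has a single column; the other two fields are virtual.
data = io.StringIO("amount\n100.5\n200\n")
rows = list(read_rows(data, schema))
```

Each resulting row carries the fiscal year and country code even though the source file only has the `amount` column.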
Implementation
Fiscal Data Package already has a `constant` property at the level of the `model` (an abstraction layer above one or many `resources`). With changes we are implementing on Fiscal Data Package, it makes more sense to move `constant` to the level of Table Schema fields, essentially introducing the idea of virtual columns. We are doing this for Fiscal Data Package in any event, but we think this is useful for Table Schema in general, and we would like to see this implementation of constants represented as virtual columns land in Table Schema v1.1.

The constant value can either be a logical value (when possible) or a physical value, in which case `type` and `format` rules will apply as usual.