Nick Ruest edited this page Sep 1, 2016 · 27 revisions

Time/Place

This meeting is a hybrid teleconference and IRC chat. Anyone is welcome to join. Here is the info:

Attendees

  • Nick Ruest
  • Christina Harlow ⭐
  • Joshua Westgard
  • Simone Sacchi
  • Kevin Ford
  • Andrew Woods
  • Trey Pendragon
  • Mark Matienzo
  • Julie Hardesty, others from Indiana (Heidi Dowding, Amol Khedkar, William Cowan)
  • Adam Wead
  • Esmé Cowles
  • Melissa Anez
  • Stefano Cossu
  • Julie Allinson
  • Karen Estlund
  • Bethany Seeger
  • Jared Whiklo
  • Justin Simpson
  • Mike Giarlo
  • Steve Van Tuyl

Agenda

  1. Introductions
  2. Repeating calls
    • Cleaning out issues
    • Deciding on releases
  3. PCDM 2.0 (context: interoperability)
  4. ...feel free to add additional agenda items

Minutes

###1. Introductions

  • Kept brief due to number of attendees.

###2a. Repeating calls: Frequency? What time?

  • Time: Andrew: This time right now? Thursdays at Noon EST
    • Andrew: Worth re-doodling? or this time works?
    • With this many folks attending, just say it's a good time
    • Clarification: First Thursday of the month - Agreed.
  • Frequency: Andrew: Monthly (i.e. first Thursday of each month) came up on GitHub issues
    • Andrew: Number of folks agreed with Monthly there
    • Karen: allows time to discuss issues online and with community
  • Next call: October 6th at Noon eastern
    • ??: This is the week of HydraConnect. Falls in middle of lunch slot
    • Mike: Suspect few folks could make it, should be okay
    • Andrew: Yes, and lunch during HydraConnect may be an opportunity for new/other folks to join a PCDM tech call
  • Scope of these calls: Stefano: What is the scope of this call? Rolling out 2.0?
    • Karen: getting folks together to discuss any issues
    • Andrew, others: agree
    • Stefano: so ongoing telephone call about current issues, not related to any specific release or topic. Just PCDM Tech.
    • Karen: Yes
    • Nick: Community-driven call; folks can add items to the agenda as they see fit, rotate chairing the call and taking notes, and work through issues

###2b. Topics, cleaning out issues for calls, releases (became an interoperability discussion)

  • Nick: So we'll just cover topics, per the scope discussion above. Lots of stuff to discuss for 2.0
  • Andrew: Highlighting idea of interoperability:
    • Remembering a goal of interoperability: think about 2 different systems using the PCDM ontologies. Could you put your system on top of another PCDM repository and make sense of what you see there?
    • We haven't decided what level of interoperability makes sense for this discussion; this is a critical factor for driving decisions on granular issues.
    • ??: We need to remember both PCDM and model interoperability: the general framework for interoperability matters, but the ontology being interoperable is also important in itself.
  • Nick: For Islandora, a discussion point has been what does interoperability mean? It is being able to publish and consume PCDM graphs, but then what is compliance, and how do you determine that?
    • Andrew: and what do you mean by consume
    • Nick: Say another repository is publishing PCDM named graphs, and Islandora is consuming them, what is constructive consumption?
    • Karen: On the question of the interoperability of PCDM, early on in the PCDM work, it was about the interoperability of sharing code. We'd know that it is successful if we can share code, not just objects. There is ORE out there already for sharing objects. The originating concern was that there are people developing great tools, but folks couldn't share code between similar projects.
      • However, now, the scope, rhetoric, methodology has changed a bit.
      • We should think about this a bit more.
    • Stefano?: Can you give an example of the code you're referring to?
      • Karen: In Hydra-Works days, thinking that Penn State, Oregon, Princeton are all ingesting books, writing code to view books, why do we all need different code to do this?
      • From that, it went to looking at data models, as it was causing a problem.
      • At this point, the Hydra-Works folks realized this work was close to what Islandora, Fedora was doing, so the question became how can we unify that?
      • Interoperability success here - from an admin perspective - is not to redo another interoperability vocabulary, but to have some wins with data and code sharing, and benefit among repo systems.
      • Stefano: Thinking here also about multi-tenant repositories, backends attached to multiple Hydra heads, Islandora, other. Agreement on data structure is important to ensure this can work.
      • Karen: it does veer the definition in some ways. We need to get straight what we want to achieve from this, this is important.
    • Esmé: There are a few options here, some are harder than others.
      • If Islandora and Hydra are both using Fedora, you want to be able to swap systems and get the same things.
      • Or if you’re describing the same things, you can serialize them and have the same graph.
      • We want to get close to this so tools can work from more than 1 data source, more than 1 application.
    • ?? : Basic data structures in PCDM are needed, so we can have agreement on a minimum common denominator for data structures shared across applications. To not get lost in details, this is a good starting point.
  • Mike: So we’re hearing: interoperability is agreement on data structures, but 2 perspectives:
      1. interop is different systems working with same stores
      2. data structure for api for ingest / export
  • Karen: A hesitation I have on some recent discussions is that we are trying to recreate ORE.
  • Trey: Hears 3 perspectives:
      1. Mike’s number 1, but Islandora is maybe going with a different architecture, so possibly not reasonable;
      2. Can look at graph and understand what we’re going for, which is useful;
      3. If I build a book, and give you a book, you get a book;
    • Which of those pieces of interoperability are we interested in?
  • Nick: Number 2. (few others agree). Possibly the only way we can do this.
    • Mike: can you say more about that?
    • Nick: Back to named graphs. Swapping Fedora between Hydra/Islandora could be difficult
      • But publishing, consuming a graph is a possibility
    • Stefano?: Different systems manage things in different ways, using different vocabs, but can comply with something core; that’s a goal. Not necessarily about publication itself, but about understanding the graph
  • Trey: If we have a core model, but nothing about how to generate the structure, we don’t get any further into interoperability with that?
    • Julie: PCDM doesn’t tell us it’s a book, just a structure.
    • Mike: That might be a further goal down the road
    • Trey: Don’t mean a book necessarily, but an object that has 6 things, and those things are represented somehow. PCDM should hit the mark where I can ingest
    • Jared: To get to that level of interoperability, we need a lot more discussion and to tighten down the specifications. So with a book object, for example, I know where/how I can go to get page objects, etc.
      • Requires a lot more structure in the ontology.
      • Need more discussion on how to do it.
      • That’s a longer term goal.
      • It is possible that knowing a person’s profile, you could ingest their objects, and longer term we can find a way to unify our data model.
      • We don’t want to keep separating our data
  • Stefano: We’re close to agreeing on some basic questions. Such as:
    • Recognizing across the board what is a real-world object (RWO).
    • How should this RWO be treated? How should the digital content related to this RWO be treated? How is metadata treated? Etc.
    • What is the RWO, what is the digital representation, what is the asset, and where is the descriptive, technical, other metadata found?
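
As a rough illustration of the "publish and consume a graph" reading of interoperability discussed above (Esmé's "serialize them and have the same graph", Stefano's "comply with something core"), the sketch below compares two descriptions of the same book as plain sets of triples. It was not presented on the call. `pcdm:Object` and `pcdm:hasMember` are PCDM terms; the `ex:` and `local:` identifiers, the two exporter functions, and the choice of which prefixes count as "core" are assumptions made only for this example.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is assumed.
RDF_TYPE = "rdf:type"
PCDM_OBJECT = "pcdm:Object"
PCDM_HAS_MEMBER = "pcdm:hasMember"


def export_book_system_a():
    """Hypothetical exporter: one system's description of a two-page book."""
    return {
        ("ex:book1", RDF_TYPE, PCDM_OBJECT),
        ("ex:page1", RDF_TYPE, PCDM_OBJECT),
        ("ex:page2", RDF_TYPE, PCDM_OBJECT),
        ("ex:book1", PCDM_HAS_MEMBER, "ex:page1"),
        ("ex:book1", PCDM_HAS_MEMBER, "ex:page2"),
    }


def export_book_system_b():
    """Hypothetical exporter: same structure plus one local-only triple."""
    return export_book_system_a() | {("ex:book1", "local:viewerHint", "paged")}


def core_triples(graph, prefixes=("pcdm:", "rdf:")):
    """Keep only triples whose predicate belongs to the assumed core vocabulary."""
    return {t for t in graph if t[1].startswith(prefixes)}


def same_core_structure(graph_a, graph_b):
    """True when both graphs agree once local extensions are ignored."""
    return core_triples(graph_a) == core_triples(graph_b)
```

Under these assumptions, the two exports differ as raw graphs, but `same_core_structure(export_book_system_a(), export_book_system_b())` holds, which is one possible way to read "if I build a book, and give you a book, you get a book".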

###3. PCDM 2.0

  • Nick: So, with that, moving to 2.0, the disagreements are Filesets and OWL versus RDFS?
    • OWL versus RDFS can be punted.
    • Stefano: Filesets need to be figured out, otherwise, we will have interoperability problems down the road
    • Trey: if we can’t agree to use a Fileset or not, that’s problematic. How would you go about it if you wouldn’t use a Fileset? How do we recognize the options?
  • Filesets Approach, Questions:
  • Jared: Has a use case for not using Filesets:
    • A lot of the objects would not have alternate representations in his repo.
    • He’d rather avoid that complexity until he needs it.
    • No metadata that goes onto that single fileset that cannot go onto the actual file.
    • Or, hasn't seen a good use case for this.
  • Mike: Are we thinking that this is going to be harder than it needs to be? We have a book model we like in the Hydra world, which uses the Hydra-Works extension, has Fileset, etc. Islandora doesn’t use Fileset, so can we create a mapping for model understanding? Code to understand each?
    • (Some general agreement)
    • Trey: But I’d like to have Fileset as optional in PCDM for the sake of his work.
    • Stefano?: Is it a common understanding that we need something on Filesets to have an interoperable and cross-platform model?
      • Nick: Islandora is not against Fileset; the concern is the mandatory or required nature of a Fileset.
    • Stefano: Is this an implementation problem or a modeling problem? Do Filesets make the modeling too complicated? Or make the implementation too complicated?
      • Jared: A little of both? But for his use case, it's more practical - implementation, storage, etc. For him, it's pragmatic.
      • Stefano: So this idea is flexible for later migrations?
    • From Esmé (IRC): https://docs.google.com/drawings/d/1D3mol7IU42q2oHzg8J1XTiQs9ccBU1XlG_P052gSi_c/edit
  • Filesets versus RWO/Intellectual Description:
  • ?: In this proposal, Filesets are not just holders of files, but holders of metadata for the intellectual content? More about separation of entities and concepts than anything else?
    • Nick: Then what would be the purpose of the object?
    • ?: Anything not representing digital content, i.e. book. So the creator on a Book Object is the author, the creator on a Fileset is the digitizer, etc.
    • Nick: This is blurring the lines between descriptive and technical metadata.
    • Jared: And this would assume that the creator on the Fileset is the person who created all the derivatives in the Fileset - this is not always true.
    • ??: A better example, perhaps, is a sculpture. A Fileset is containing different views
      • ??: Arguably, that’s the use case for Parts. The Fileset's purpose is to group files. Descriptive metadata should apply to content of both of those files.
      • Trey?: We used to use Filesets as the filler for an object, i.e. with books, put descriptive metadata on the fileset for the pages.
      • ??: So, in the case of a compound object, your photograph is an object with two Filesets, one for the front, one for the back? You would not want to do this?
      • Trey: No, we would not want to do this.
      • ??: So with a book, each page is its own object?
      • Trey: Yes. If you use Filesets to represent pages, this allows you to have a book Object linked to a PDF that represents the full book, as well as 6 pages with page-specific digital surrogates.
      • Stefano: So you are separating RWO from the digital surrogate
      • Simone?: Not comfortable with descriptive metadata on the Fileset. RWO description happens at the object level, and Filesets are information about the digital surrogate.
    • ??: Thinking of paintings:
      • Painting object has fileset that represents full painting.
      • Also has detail view, fileset has detail representations. This would be a part with a separate fileset?
    • Simone: Thinking of a website being stored. If you don’t have filesets connecting different files (HTML, CSS, etc.) that represent something.
  • Optional or Restrictions on Filesets:
  • Jared: Don’t have a problem with Filesets, but why can’t this be an optional construct?
    • Trey: potential agreement here is
      • x hasFileset y.
      • y hasFile z.
      • x hasFile z.
    • Keep filesets as buffer, but can also put files on objects.
      • Remove domain of hasFile / or at least, reconsider.
      • Keep hasFileset.
    • Jared: And, if you don’t have a Fileset, any file attached to that object is a full representation.
    • Adam?: So, back to Andrew’s point, requiring Fileset means you have to overbuild your system. But it does boil down to the question of if you use PCDM that way, we use PCDM this way, and we switch the back ends, could it operate the same? And we're thinking of a Fileset as a buffer?
      • Trey: yes, fileset is optional and clearly defined as a buffer between files
    • Nick: That (the buffer comment) answers the use of Fileset. Does it answer questions about whether or not Filesets could stand alone - seems so?
      • ?: Fileset is the only place where you find an aggregation of files that have the same source. A File might stand by itself. A Fileset represents the act of digitizing content.
      • Trey: Work extension has a 1:n relationship for filesets. So the same cardinality restrictions for hasFileset as hasFile?
        • So, File can’t be in multiple objects, Fileset is same?
        • [missed comment]
        • Esmé: Fileset has to be a representation of the object it is contained by. (comment re: 1:n relationship of Fileset).
        • Stefano: Has a use case for an n:n relationship for Filesets. He has digital assets that can be the representation of a Work, but can also be a representation of the Person in the painting.
        • Karen, others: Would this not be a relationship between the Work and the Person, and the Fileset remains the digital surrogate only of the Work?
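
The optional-Fileset idea above (Trey's "x hasFileset y / y hasFile z / x hasFile z" and Jared's rule that a file attached directly to an object is a full representation) can be sketched roughly as follows. This is only an illustration under stated assumptions, not something agreed on the call: `pcdm:hasFile` is an existing PCDM predicate, while `ex:hasFileset` is a hypothetical name for the proposed predicate, and all `ex:` resources are made up. The second helper shows what checking a 1:n restriction on Filesets might look like if that option were chosen.

```python
HAS_FILE = "pcdm:hasFile"      # existing PCDM predicate
HAS_FILESET = "ex:hasFileset"  # hypothetical predicate, only proposed on the call


def objects_of(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def files_of(graph, obj):
    """Files attached directly to obj, plus files reached through its Filesets."""
    files = objects_of(graph, obj, HAS_FILE)
    for fileset in objects_of(graph, obj, HAS_FILESET):
        files |= objects_of(graph, fileset, HAS_FILE)
    return files


def filesets_with_multiple_owners(graph):
    """Filesets linked from more than one object (would violate a 1:n rule)."""
    owners = {}
    for s, p, o in graph:
        if p == HAS_FILESET:
            owners.setdefault(o, set()).add(s)
    return {fs for fs, objs in owners.items() if len(objs) > 1}


# One object that uses a Fileset as a buffer, and one that attaches a file directly.
graph = {
    ("ex:book", HAS_FILESET, "ex:fs1"),
    ("ex:fs1", HAS_FILE, "ex:page.tiff"),
    ("ex:fs1", HAS_FILE, "ex:page.jpg"),
    ("ex:photo", HAS_FILE, "ex:photo.tiff"),
}
```

With this sample graph, `files_of` treats both objects the same way from the consumer's side, which is the sense in which the Fileset acts as an optional buffer. Stefano's n:n use case would correspond to a graph where `filesets_with_multiple_owners` returns a non-empty set.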

###Moving Forward:

  • Fileset:
    • Should be in core, but have to determine nature of it, make it not mandatory.
    • Need to determine nature and definition of fileset, questions of membership (1:n versus n:n)
      • Continue to use ticket 57 for discussion of these questions.