Ticket #49 (new defect)
duplicates in registry
| Reported by: | jkofmsk | Owned by: | jlsantoso |
|---|---|---|---|
| Priority: | major | Component: | ARIADNE Registry |
| Version: | Version 2.0 | Keywords: | |
| Cc: | | | |
Description
There seem to be duplicates in the registry, e.g. OpenJorum and JorumOpen. Both have the same OAI-PMH target, so they are the same. We need to think about how to:
- avoid duplicates
- create better identifiers, maybe based on the targets themselves.

Identifier: OpenJorum
Target 0:
- Entry: target-oai-pmh-OpenJorum
- Catalog: ICOPER_targets
- Location: http://open.jorum.ac.uk/oai/request
- Protocol Name: oai-pmh
- Protocol Description Binding Name Space: http://www.openarchives.org/OAI/2.0/
- Protocol Description Binding Location: http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd


Identifier: JorumOpen
Description: OCLC's OAICat Repository Framework
Target 0:
- Entry: target-oai-pmh-JorumOpen
- Catalog: ariadne-registry_targets
- Location: http://open.jorum.ac.uk/oai/request
- Protocol Name: oai-pmh
- Protocol Description Binding Name Space: http://www.openarchives.org/OAI/2.0/
- Protocol Description Binding Location: http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd

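
One possible way to base identifiers on the targets themselves is sketched below. It is only an illustration, not how the registry currently works: the function name `target_identifier` and the `<protocol>:<host><path>` scheme are assumptions made for this example, and the second location is a made-up spelling variant of the Jorum URL, used to show the normalization.

```python
from urllib.parse import urlsplit


def target_identifier(protocol: str, location: str) -> str:
    """Build an identifier from a target's protocol name and location.

    Hypothetical scheme: "<protocol>:<host><path>". Case differences in
    the protocol and host, default ports, and trailing slashes are
    normalized away. Query strings are ignored in this sketch.
    """
    parts = urlsplit(location.strip())
    host = (parts.hostname or "").lower()
    # Keep the port only when it is explicit and not the scheme default.
    default_port = {"http": 80, "https": 443}.get(parts.scheme.lower())
    if parts.port is not None and parts.port != default_port:
        host = f"{host}:{parts.port}"
    path = parts.path.rstrip("/")
    return f"{protocol.strip().lower()}:{host}{path}"


# Location taken from the two entries in this ticket.
from_ticket = target_identifier("oai-pmh", "http://open.jorum.ac.uk/oai/request")
# Hypothetical variant spelling of the same endpoint.
variant = target_identifier("OAI-PMH", "http://Open.Jorum.ac.uk:80/oai/request/")
```

With such a scheme, OpenJorum and JorumOpen would both map to the same identifier, so a second registration of the same target could be noticed when it is submitted.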
Change History
comment:2 Changed 2 years ago by jlsantoso
In fact, you can see all the "duplicates" here:
http://ariadne.cs.kuleuven.be/ariadne-registry/rss/ReportRepeatedTargets.jsp
The question is what criteria we can use to decide whether an instance is repeated. I'm not sure whether we can say that they are repeated just because they have the same target. Currently, the catalog captures both functional and organizational aspects. From this point of view they are not repeated instances, because others need to have their own group of targets. At least, that was the criterion established in ICOPER. This way, they can search by catalog and find all the targets that they want to harvest.
I'm not saying that it's the best way; it depends. If we have a clear objective for how users should use the registry, we can add some constraints. Otherwise, we should monitor the information stored in the registry, for instance via RSS feeds or visualizations, and show how users actually use the registry. In the end, that is the goal of Web 2.0 applications: we don't dictate how users use an application, but we can analyze the information that we have and explain how they use it.
But that's only my point of view. Of course, we can establish some kind of procedure to delete collections, sending emails and contacting people. But it is not clear to me what the real criteria for detecting duplicated instances are.
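
To make the two candidate criteria concrete, here is a minimal sketch that applies both to the two entries from the description. The dictionaries are a simplified, assumed representation of the registry entries (only identifier, catalog and location are kept), and `find_repeated` is a hypothetical helper, not part of the registry code.

```python
from collections import defaultdict

# The two entries from the ticket description, reduced to three fields.
entries = [
    {"identifier": "OpenJorum", "catalog": "ICOPER_targets",
     "location": "http://open.jorum.ac.uk/oai/request"},
    {"identifier": "JorumOpen", "catalog": "ariadne-registry_targets",
     "location": "http://open.jorum.ac.uk/oai/request"},
]


def find_repeated(entries, key_fields):
    """Group entries by the chosen fields and keep groups with more than one member."""
    groups = defaultdict(list)
    for entry in entries:
        key = tuple(entry[field] for field in key_fields)
        groups[key].append(entry["identifier"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Criterion 1: same target location means repeated.
by_target = find_repeated(entries, ["location"])
# Criterion 2: repeated only if catalog and target location both match.
by_catalog_and_target = find_repeated(entries, ["catalog", "location"])
```

Under the first criterion the two Jorum entries are reported as one repeated group; under the second, which corresponds to the per-catalog view used in ICOPER, nothing is reported. Which of the two is appropriate is exactly the open question in this ticket.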

The same goes, for instance, for:
Identifier: ariadne_members
Identifier: AriadneMembersRepository
From where do these duplicates originate?