Ticket #49 (new defect)

Opened 2 years ago

Last modified 2 years ago

duplicates in registry

Reported by: jkofmsk
Owned by: jlsantoso
Priority: major
Component: ARIADNE Registry
Version: Version 2.0
Keywords:
Cc:

Description

There seem to be duplicates in the registry, e.g. openJorum and JorumOpen. Both have the same OAI-PMH target, so they are the same. We need to think about how to:

  • avoid duplicates
  • create better identifiers, maybe based on the targets themselves.

Identifier: OpenJorum
Target 0:
  Entry: target-oai-pmh-OpenJorum
  Catalog: ICOPER_targets
  Location: http://open.jorum.ac.uk/oai/request
  Protocol Name: oai-pmh
  Protocol Description Binding Name Space: http://www.openarchives.org/OAI/2.0/
  Protocol Description Binding Location: http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd

Identifier: JorumOpen
Description: OCLC's OAICat Repository Framework
Target 0:
  Entry: target-oai-pmh-JorumOpen
  Catalog: ariadne-registry_targets
  Location: http://open.jorum.ac.uk/oai/request
  Protocol Name: oai-pmh
  Protocol Description Binding Name Space: http://www.openarchives.org/OAI/2.0/
  Protocol Description Binding Location: http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd
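One possible way to base identifiers on the targets themselves, as proposed in the description, is to derive the identifier from the protocol name and a normalized form of the target location. The following is only a minimal sketch under assumptions: the function names `normalize_location` and `target_identifier`, the normalization rules, and the truncated SHA-1 digest are all hypothetical choices for illustration, not how the ARIADNE Registry actually builds identifiers.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def normalize_location(location: str) -> str:
    """Hypothetical normalization: lower-case scheme and host, drop the
    default port, a trailing slash, and any fragment."""
    parts = urlsplit(location.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    default_port = {"http": 80, "https": 443}.get(scheme)
    if parts.port is not None and parts.port != default_port:
        host = f"{host}:{parts.port}"
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((scheme, host, path, parts.query, ""))


def target_identifier(protocol: str, location: str) -> str:
    """Hypothetical identifier scheme: protocol name plus a short digest
    of the normalized location, so equal targets get equal identifiers."""
    normalized = normalize_location(location)
    digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()[:12]
    return f"target-{protocol.lower()}-{digest}"


# Both entries from this ticket point at the same location, so under this
# scheme they would receive the same identifier.
open_jorum = target_identifier("oai-pmh", "http://open.jorum.ac.uk/oai/request")
jorum_open = target_identifier("oai-pmh", "http://open.jorum.ac.uk/oai/request")
print(open_jorum == jorum_open)
```

With such a scheme, an attempt to register a second entry for the same target would collide on the identifier, which gives the registry a natural place to reject or merge the duplicate. Whether that is desirable depends on whether the same target is allowed to appear in more than one catalog.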

Change History

comment:1 Changed 2 years ago by jkofmsk

The same goes, for instance, for:

Identifier: ariadne_members

Identifier: AriadneMembersRepository

From where do these duplicates originate?

comment:2 Changed 2 years ago by jlsantoso

In fact, you can see all the "duplicates" here:

 http://ariadne.cs.kuleuven.be/ariadne-registry/rss/ReportRepeatedTargets.jsp

The question is what criteria we should follow to decide whether an instance is repeated. I'm not sure whether we can say that they are repeated just because they have the same target. Currently, the catalog captures functional and organizational aspects. From this point of view they are not repeated instances, because others need to have their own group of targets. At least, that was the criterion established in ICOPER. This way, they can search by catalog and find all the targets that they want to harvest.

I'm not saying that it's the best way; it depends. If we have a clear objective for how users should use the registry, we can add some constraints. Otherwise, we should monitor the information stored in the registry, for instance via RSS feeds or visualizations, and show how users actually use it. In the end, that is the goal of Web 2.0 applications: we don't care how users use an application, but we can analyze the information that we have and explain how they use it.

But that's only my point of view. Of course, we can establish some kind of procedure to delete collections, such as sending emails and contacting people. But it is not clear to me what the real criteria for detecting duplicated instances are.
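To make the two readings of "repeated" discussed in this comment concrete, here is a small sketch that groups registry entries under either criterion. The records are the two entries quoted in the ticket description; the function name `repeated` and the tuple layout are assumptions made for illustration, and this is not the logic behind ReportRepeatedTargets.jsp, which is not shown in this ticket.

```python
from collections import defaultdict

# (identifier, catalog, location), taken from the ticket description.
ENTRIES = [
    ("OpenJorum", "ICOPER_targets", "http://open.jorum.ac.uk/oai/request"),
    ("JorumOpen", "ariadne-registry_targets", "http://open.jorum.ac.uk/oai/request"),
]


def repeated(entries, key):
    """Group identifiers by key(catalog, location) and keep only the
    groups that contain more than one entry."""
    groups = defaultdict(list)
    for identifier, catalog, location in entries:
        groups[key(catalog, location)].append(identifier)
    return {k: ids for k, ids in groups.items() if len(ids) > 1}


# Criterion A: same target location means duplicate, regardless of catalog.
by_location = repeated(ENTRIES, lambda catalog, location: location)

# Criterion B: duplicate only if the same location appears twice
# within the same catalog.
by_catalog_and_location = repeated(
    ENTRIES, lambda catalog, location: (catalog, location)
)

print(by_location)
print(by_catalog_and_location)
```

Under criterion A the two Jorum entries are reported as one duplicate group; under criterion B nothing is reported, because each entry belongs to a different catalog. Choosing between the two is exactly the open question raised above.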
