Abstract
This paper describes a practical, specification-first software implementation of the Akoma Ntoso (AKN) Media Type v1.0 as a parser within the Apache Tika content analysis toolkit. This furthers our intention of extending the OASIS AKN committee specification as a commonly identified IANA media type from which users, developers and publishers can benefit, with the aim of lowering the barrier to entry. Within the scope of this work we describe (i) the community-driven development of the open source Akomantoso-lib parser as a Java class representation of the AKN XML schema; (ii) a software-driven, evolutionary argument for why extended engagement, interoperability and use of software clients for the AKN legal document specification are essential to the advancement of legal informatics; and (iii) a detailed description of the AKN parser and extraction functionality within Apache Tika, a metadata and content analysis toolkit. Tika, an open source project permissively licensed under the Apache License v2.0, can currently detect, parse and extract metadata and data from over 1,400 IANA media types, making it the digital babelfish of open source content analysis toolkits. Our work to implement Tika detection, parsing and extraction wrappers for AKN significantly lowers the barrier to entry for stakeholders across the AKN spectrum. It also provides AKN consumers with a reliable, heavily supported, community-driven and flexible software implementation for continued use of the AKN standard for the representation, manifestation and interpretation of legal documentation.
Original language | English |
---|---|
Publication status | Published - 3 Aug 2015 |
Keywords
- web-services
- data integration
- legal informatics
- Akoma Ntoso
- Apache Tika
- metadata