

This chapter addresses the problems of describing an encoded work so that the text itself, its source, its encoding, and its revisions are all thoroughly documented.

Such documentation is equally necessary for scholars using the texts, for software processing them, and for cataloguers in libraries and archives. Together these descriptions and declarations provide an electronic analogue to the title page attached to a printed work. They also constitute an equivalent for the content of the code books or introductory manuals customarily accompanying electronic data sets.

A TEI header can be a very large and complex object, or it may be a very simple one. Some application areas (for example, the construction of language corpora and the transcription of spoken texts) may require more specialized and detailed information than others. The present proposals therefore define both a core set of elements (all of which may be used without formality in any TEI header) and some additional elements which become available within the header as the result of including additional specialized modules within the schema.

When the module for language corpora described in chapter 15 Language Corpora is in use, for example, several additional elements are available, as further detailed in that chapter. The next section of the present chapter briefly introduces the overall structure of the header and the kinds of data it may contain.

This is followed by a detailed description of all the constituent elements which may be used in the core header. The teiHeader element should be clearly distinguished from the front matter of the text itself for which see section 4. A composite text, such as a corpus or collection, may contain several headers, as further discussed below.

In the usual case, however, a TEI-conformant text will contain a single teiHeader element, followed by a single text element. The TEI Header provides a very rich collection of metadata categories, but makes no claim to be exhaustive.
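The usual arrangement described above, a single teiHeader followed by a single text element, can be sketched with Python's standard library. This is only an illustration under stated assumptions: the helper names (`tei`, `build_minimal_document`) and all sample content are hypothetical, and the three children given to fileDesc (titleStmt, publicationStmt, sourceDesc) follow the TEI scheme as generally documented rather than anything stated in this passage.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # namespace used by TEI P5 documents
ET.register_namespace("", TEI_NS)


def tei(tag: str) -> str:
    """Return the namespace-qualified form of a TEI element name."""
    return f"{{{TEI_NS}}}{tag}"


def build_minimal_document(title: str) -> ET.Element:
    """Build a TEI root holding one teiHeader followed by one text element.

    Hypothetical helper: the sample content is invented for illustration.
    """
    root = ET.Element(tei("TEI"))

    # The header comes first; fileDesc is its first component.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    publication = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(publication, tei("p")).text = "Unpublished demonstration file."
    source = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source, tei("p")).text = "Born digital."

    # The text itself follows the header.
    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    ET.SubElement(body, tei("p")).text = "Sample paragraph."
    return root


document = build_minimal_document("A demonstration text: an electronic transcription")

# Local names of the root's children, in document order.
child_names = [child.tag.split("}")[1] for child in document]
print(child_names)
print(ET.tostring(document, encoding="unicode"))
```

A composite text such as a corpus would carry more than one header, which this sketch does not attempt to model.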

It is certainly the case that individual projects may wish to record specialised metadata which either does not fit within one of the predefined categories identified by the TEI Header or requires a more specialized element structure than is proposed here.

To overcome this problem, the encoder may elect to define additional elements using the customization methods discussed elsewhere in these Guidelines. The TEI class system makes such customizations simpler to effect and easier to use in interchange.

This section describes the fileDesc element, which is the first component of the teiHeader element. The bibliographic description of a machine-readable or digital text resembles in structure that of a book, an article, or any other kind of textual object.

The file description element of the TEI header has therefore been closely modelled on existing standards in library cataloguing; it should thus provide enough information to allow users to give standard bibliographic references to the electronic text, and to allow cataloguers to catalogue it. Bibliographic citations occurring elsewhere in the header, and also in the text itself, are derived from the same model; on bibliographic citations in general, see further section 3.

See further section 2.

P5: Guidelines for Electronic Text Encoding and Interchange

The title element contains the chief name of the electronic work, including any alternative title or subtitles it may have. It may be repeated, if the work has more than one title (perhaps in different languages), and takes whatever form is considered appropriate by its creator. This will distinguish the electronic work from the source text in citations and in catalogues which contain descriptions of both types of material.

The name of the file in which the work is stored is likely to change frequently, as new copies of the file are made on the computer system. Its form is entirely dependent on the particular computer system in use and thus cannot always easily be transferred from one system to another. Moreover, a given work may be composed of many files. For these reasons, these Guidelines strongly recommend that such names should not be used as the title for any electronic work. Helpful guidance on the formulation of useful descriptive titles in difficult cases may be found in the Anglo-American Cataloguing Rules (Gorman and Winkler, chapter 25) or in equivalent national-level bibliographical documentation.


The elements author, sponsor, funder, and principal are specializations of the more general respStmt element. These elements are used to provide the statements of responsibility which identify the person(s) responsible for the intellectual or artistic content of an item and any corporate bodies from which it emanates. Any number of such statements may occur within the title statement.

At a minimum, identify the author of the text and, where appropriate, the creator of the file. If the bibliographic description is for a corpus, identify the creator of the corpus. Optionally include also names of others involved in the transcription or elaboration of the text, sponsors, and funding agencies.

The name of the person responsible for physical data input need not normally be recorded, unless that person is also intellectually responsible for some aspect of the creation of the file. Where the person whose responsibility is to be documented is not an author, sponsor, funding body, or principal researcher, the respStmt element should be used. This has two subcomponents: a resp element indicating the nature of the responsibility, and a name element identifying the responsible individual or organization. No specific recommendations are made at this time as to appropriate content for the resp element. Names given may be personal names or corporate names.

Give all names in the form in which the persons or bodies wish to be publicly cited. This would usually be the fullest form of the name, including first names.
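The statements of responsibility described above can be illustrated with a small title statement and a function that reads it back. This is a sketch under stated assumptions: every name and title in the sample is invented, the function `responsibility_statements` is hypothetical, and the TEI namespace is omitted to keep the example short.

```python
import xml.etree.ElementTree as ET

# Illustrative title statement; all names are invented placeholders.
# (A real TEI file would declare the TEI namespace on its root element.)
TITLE_STMT = """
<titleStmt>
  <title>Hypothetical Tales: an electronic transcription</title>
  <author>Jane Example</author>
  <sponsor>Example University Library</sponsor>
  <funder>Example Research Council</funder>
  <principal>John Placeholder</principal>
  <respStmt>
    <resp>transcribed and encoded by</resp>
    <name>Alex Sample</name>
  </respStmt>
</titleStmt>
""".strip()

SPECIALIZED = ("author", "sponsor", "funder", "principal")


def responsibility_statements(stmt: ET.Element) -> list:
    """Collect (role, name) pairs from a titleStmt element."""
    pairs = []
    # Specialized elements: the element name itself states the role.
    for tag in SPECIALIZED:
        for element in stmt.findall(tag):
            pairs.append((tag, element.text))
    # General case: respStmt pairs a resp (role) with a name.
    for resp_stmt in stmt.findall("respStmt"):
        pairs.append((resp_stmt.findtext("resp"), resp_stmt.findtext("name")))
    return pairs


title_stmt = ET.fromstring(TITLE_STMT)
for role, name in responsibility_statements(title_stmt):
    print(f"{role}: {name}")
```

The respStmt entry covers a contributor who is neither author, sponsor, funder, nor principal researcher, which is the case the general element is meant for.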

For printed texts, the word edition applies to the set of all the identical copies of an item produced from one master copy and issued by a particular publishing agency or a group of such agencies. A change in the identity of the distributing body or bodies does not normally constitute a change of edition, while a change in the master copy does.

Synonymous terms used in these Guidelines are version, level, and release. The words revision and update, by contrast, are used for minor changes to a file which do not amount to a new edition.

The general principle proposed here is that the production of a new edition entails a significant change in the intellectual content of the file, rather than its encoding or appearance.

TEI P5: Guidelines for Electronic Text Encoding and Interchange: 2 The TEI Header

The addition of analytic coding to a text would thus constitute a new edition, while automatic conversion from one coded representation to another would not. Changes relating to the character code or physical storage details, corrections of misspellings, simple changes in the arrangement of the contents, and changes in the output format do not normally constitute a new edition, whereas the addition of new information does. Clearly, there will always be borderline cases and the matter is somewhat arbitrary.

The simplest rule is: An edition statement is optional for the first release of a computer file; it is mandatory for each later release, though this requirement cannot be enforced by the parser. Note that all changes in a file, whether or not they are regarded as constituting a new edition or simply a new revision, should be independently noted in the revision description section of the file header; see section 2. The edition element should contain phrases describing the edition or version, including the word edition, version, or equivalent, together with a number or date, or terms indicating difference from other editions such as new edition, revised edition, etc.

Any dates that occur within the edition statement should be marked with the date element. The n attribute of the edition element may be used (as elsewhere) to supply any formal identification, such as a version number, for the edition. One or more respStmt elements may also be used to supply statements of responsibility for the edition in question. These may refer to individuals or corporate bodies and can indicate functions such as that of a reviser, or can name the person or body responsible for the provision of supplementary matter, of appendices, etc.
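As an illustration of the edition statement just described, the sketch below parses a small sample and pulls out the version number, the marked-up date, and the statement of responsibility. The sample values and the function name `describe_edition` are assumptions made for the example, and the TEI namespace is again left out for brevity.

```python
import xml.etree.ElementTree as ET

# Illustrative edition statement; the version number, date, and name are invented.
EDITION_STMT = """
<editionStmt>
  <edition n="2.1">Second revised edition, <date when="1994-03-15">15 March 1994</date></edition>
  <respStmt>
    <resp>revised by</resp>
    <name>Alex Sample</name>
  </respStmt>
</editionStmt>
""".strip()


def describe_edition(stmt: ET.Element) -> dict:
    """Summarize an editionStmt: label, formal number, dates, and responsibility."""
    edition = stmt.find("edition")
    return {
        # Full phrase describing the edition, with whitespace normalized.
        "label": " ".join("".join(edition.itertext()).split()),
        # Formal identification supplied through the n attribute.
        "n": edition.get("n"),
        # Normalized values of any dates marked up inside the edition element.
        "dates": [date.get("when") for date in edition.findall("date")],
        # (role, name) pairs for each statement of responsibility.
        "responsible": [
            (resp_stmt.findtext("resp"), resp_stmt.findtext("name"))
            for resp_stmt in stmt.findall("respStmt")
        ],
    }


summary = describe_edition(ET.fromstring(EDITION_STMT))
print(summary)
```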

For further detail on the respStmt element, see section 3.

For printed books, information about the carrier, such as the kind of medium used and its size, is of great importance in cataloguing procedures. The print-oriented rules for bibliographic description of an item's medium and extent need some re-interpretation when applied to electronic media. An electronic file exists as a distinct entity quite independently of its carrier and remains the same intellectual object whether it is stored on a magnetic tape, a CD-ROM, a set of floppy disks, or as a file on a mainframe computer.


Since, moreover, these Guidelines are specifically aimed at facilitating transparent document storage and interchange, any purely machine-dependent information should be irrelevant as far as the file header is concerned. This is particularly true of information about file type, although library-oriented rules for cataloguing often distinguish two types of computer file. This distinction is quite difficult to draw in some cases, for example, hypermedia or texts with built-in search and retrieval software.

The use of standard abbreviations for units of quantity is recommended where applicable, here as elsewhere.

The publisher is the person or institution by whose authority a given edition of the file is made public.


The distributor is the person or institution from whom copies of the text may be obtained. Where a text is not considered formally published, but is nevertheless made available for circulation by some individual or organization, this person or institution is termed the release authority. Note that the dates, places, etc. If the text was created at some date other than its date of publication, its date of creation should be given within the profileDesc element, not in the publication statement.
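The roles of publisher, distributor, and release authority can be sketched as a publication statement plus a small function that reports which agencies it names. This is an assumption-laden illustration: the organization names are invented, `publishing_agencies` is a hypothetical helper, and the element name `authority` for the release authority comes from the TEI scheme as generally documented, not from the passage above.

```python
import xml.etree.ElementTree as ET

# Illustrative publication statement; organization names and the date are invented.
PUBLICATION_STMT = """
<publicationStmt>
  <publisher>Example University Press</publisher>
  <distributor>Example Text Archive</distributor>
  <date when="1994">1994</date>
</publicationStmt>
""".strip()

# Assumed element names for the three kinds of agency discussed in the text.
AGENCY_ELEMENTS = ("publisher", "distributor", "authority")


def publishing_agencies(stmt: ET.Element) -> list:
    """Return (role, name) pairs for each agency named in a publicationStmt."""
    return [(child.tag, child.text) for child in stmt if child.tag in AGENCY_ELEMENTS]


publication_stmt = ET.fromstring(PUBLICATION_STMT)
agencies = publishing_agencies(publication_stmt)
print(agencies)

# The date here is the date of publication; a separate date of creation
# belongs in profileDesc rather than in this statement.
print(publication_stmt.find("date").get("when"))
```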

Give any other useful dates. Additional detailed elements may be used for the encoding of names, dates, and addresses, as further described in section 3.

The idno element may be used to supply any identifying number associated with the item, including both standard numbers such as an ISSN and particular issue numbers. Arabic numerals separated by punctuation are recommended for this purpose.

A sampling declaration which applies to more than one text or division of a text need not be repeated in the header of each such text.

Instead, the decls attribute of each text or subdivision of the text to which the sampling declaration applies may be used to supply a cross reference to it, as further described elsewhere in these Guidelines.

Was the text corrected during or after data capture? If so, were corrections made silently or are they marked using the tags described in section 3. What principles have been adopted with respect to omissions, truncations, dubious corrections, alternate readings, false starts, repetitions, etc.?

Was the text normalized, for example by regularizing the non-standard spellings, dialect forms, etc.? If so, were normalizations performed silently or are they marked using the tags described in section 3. What authority was used for the regularization?

Also, what principles were used when normalizing numbers to provide the standard values for the value attribute described in section 3.

How were quotation marks processed? Are apostrophes and quotation marks distinguished? Are quotation marks retained as content in the text or replaced by markup? Are there any special conventions regarding, for example, the use of single or double quotation marks when nested? Is the file consistent in its practice or has this not been checked?

What principle has been adopted with respect to end-of-line hyphenation where source lineation has not been retained? Have soft hyphens been silently removed, and if so what is the effect on lineation and pagination?
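The questions above about correction, normalization, quotation, and hyphenation are the kind an editorial declaration is expected to answer. The sketch below checks a sample declaration for practices left undocumented. It rests on several assumptions: the element names (editorialDecl, correction, normalization, quotation, hyphenation, segmentation) and the attributes shown are taken from the TEI scheme as generally documented rather than from this passage, the prose inside the sample is invented, and `undocumented_practices` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Illustrative editorial declaration; the statements inside are invented.
# Attribute names and values are assumptions drawn from the TEI scheme.
EDITORIAL_DECL = """
<editorialDecl>
  <correction method="silent">
    <p>Obvious typesetting errors were corrected without comment.</p>
  </correction>
  <normalization method="markup">
    <p>Regularized spellings are marked; the source form is retained.</p>
  </normalization>
  <quotation marks="none">
    <p>Quotation marks were replaced by markup.</p>
  </quotation>
  <hyphenation eol="none">
    <p>End-of-line hyphens were removed silently.</p>
  </hyphenation>
</editorialDecl>
""".strip()

# Practices this sketch expects a declaration to cover.
EXPECTED_PRACTICES = ("correction", "normalization", "quotation", "hyphenation", "segmentation")


def undocumented_practices(decl: ET.Element) -> list:
    """List expected editorial practices that have no child element in decl."""
    return [name for name in EXPECTED_PRACTICES if decl.find(name) is None]


editorial_decl = ET.fromstring(EDITORIAL_DECL)
missing = undocumented_practices(editorial_decl)
print("Not documented:", missing)
```

A check like this only confirms that a statement exists; whether the statement answers the questions adequately still needs a human reader.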

How is the text segmented? If s or seg segmentation units have been used to divide up the text for analysis, how are they marked and how was the segmentation arrived at?

In most cases, attributes bearing standardized values (such as the when or when-iso attribute on dates) should conform to a defined W3C or ISO datatype. In cases where this is not appropriate, this element may be used to describe the standardization methods underlying the values supplied.
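A rough check of whether when values look like standardized dates might be sketched as follows. This is a deliberately simplified assumption, not the full W3C or ISO datatype: the pattern accepts only the forms YYYY, YYYY-MM, and YYYY-MM-DD, does not verify that a day exists in its month, and ignores times and time zones. The sample paragraph and the function name `nonconforming_dates` are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Simplified pattern (an assumption for this sketch): YYYY, YYYY-MM, or YYYY-MM-DD.
DATE_PATTERN = re.compile(r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$")

# Illustrative passage; the dates are invented.
SAMPLE = """
<p>Created <date when="1994-03-15">15 March 1994</date>, revised
<date when="1995-06">June 1995</date>, with more planned for
<date when="next spring">next spring</date>.</p>
""".strip()


def nonconforming_dates(root: ET.Element) -> list:
    """Return the when values that do not match the simplified date pattern."""
    values = [date.get("when") for date in root.iter("date") if date.get("when") is not None]
    return [value for value in values if not DATE_PATTERN.match(value)]


problems = nonconforming_dates(ET.fromstring(SAMPLE))
print(problems)
```

Values flagged this way are candidates either for correction or for an explanatory note on how they were standardized.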

If so, how was it generated? How was it encoded? If feature-structure analysis has been used, are fsdDecl elements section