Derwent World Patents Index on STN (Implementation)

 


Q: How is the data organised within the new DWPI? What are the Invention and Member Patent (publication) Levels?

A: Two “levels” of data are now available within DWPI. The Invention Level is the traditional view of DWPI, comprising bibliographic data, Thomson Scientific value-add titles and abstracts, and general and, where appropriate, in-depth indexing. The Member Patent Level allows users to search and display bibliographic data and general indexing information associated with the individual documents that make up the patent family represented at the Invention Level. This allows very specific searching of individual documents. Additional data elements such as original titles and abstracts, claims, addresses and agent information are also present at the Member Patent Level.


Q: How do I search these Invention and Member Patent Levels?

A: You can continue to search DWPI as you have always done if you are only interested in the traditional view of DWPI (the Invention Level). No additional search or display qualifiers are required.

Data elements which are common to the two levels, such as Thomson Scientific standardized patent assignees (PA), have the same search qualifiers. To restrict a search to the Member Patent Level only, for a more focused search, use (L) proximity in combination with the document level qualifier /DLVL (Publication/DLVL).

Data elements which are unique to the Member Patent Level, such as original patent assignees, have been assigned new fields and so can be searched using the field qualifier alone (e.g. PAO).


Q: May I still use the Link (L) proximity operator?

A: Yes, you may, but it may behave slightly differently than before. The Link proximity level has been used to organize the data into Invention and Member Patent Levels: each Link corresponds either to the invention or to one of the individual publications. It is used in combination with the document level qualifier /DLVL to restrict searches to the Member Patent Level only (Publication/DLVL).

The previous use of (L), to indicate that the search terms in the basic index belong to a common display field, is no longer supported. For example:
=> s chain (L) sprocket
previously required ‘chain’ and ‘sprocket’ to reside in a common display field such as AB or TI. In the reloaded file this is no longer the case; instead, the operator gives the same results as a Boolean ‘AND’.
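
The change in (L) behaviour can be illustrated with a toy model in Python. This is only a sketch: the record layout and the function names old_link and new_link are assumptions made for illustration and say nothing about how STN implements the operator internally.

```python
# Toy model only: a record is a dict mapping display-field names to text.
record = {
    "TI": "Drive chain tensioner",
    "AB": "A sprocket assembly for bicycles.",
}

def words(text):
    """Return the lower-cased word set of a field, with basic punctuation stripped."""
    return {w.strip(".,;:()").lower() for w in text.split()}

def old_link(rec, a, b):
    """Previous behaviour: both terms must occur in one common display field."""
    return any(a in words(t) and b in words(t) for t in rec.values())

def new_link(rec, a, b):
    """Reloaded file: same result as a Boolean AND over the whole record."""
    all_words = set().union(*(words(t) for t in rec.values()))
    return a in all_words and b in all_words

print(old_link(record, "chain", "sprocket"))  # False: terms sit in different fields
print(new_link(record, "chain", "sprocket"))  # True: both terms occur in the record
```

A record of this kind would not have been retrieved by the example query before the reload, but is retrieved now.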


Q: Why are there two “Basic Indexes”: the standard Basic Index and the new Extended Basic Index?

A: In order to retain the ability to search DWPI easily and with great precision using the Thomson Scientific value-add titles and abstracts, the new First Level data elements do NOT form part of the standard Basic Index. Instead, this data has been indexed separately in an Extended Basic Index.


Q: What fields make up the standard Basic Index and the new Extended Basic Index?

A: In general, the standard Basic Index comprises Thomson Scientific value-add fields and the Extended Basic Index comprises First Level data.

The standard Basic Index comprises:

  • Thomson Scientific value-add title, title terms, additional words and all abstract words (including Equivalent Abstracts, Extension Abstracts and Documentation Abstracts).

The Extended Basic Index comprises:

  • Original title, abstract and first claim (in English, German and French).


Q: The standard Basic Index of WPINDEX (non-subscriber file) and WPIDS (subscriber file) comprises the Thomson Scientific value-add Extension Abstracts and Documentation Abstracts. Does this mean that I can now display these abstracts as well even though I am not a Thomson Scientific subscriber?

A: No. Even though the text of these abstracts is now available for searching within WPINDEX and WPIDS, the abstracts can still only be viewed by eligible subscribers within WPIX (subscriber file with Extension Abstracts). Please contact your nearest Thomson Scientific office if you wish to discuss becoming an eligible subscriber to this data.


Q: Can I search in German and French?

A: Yes, German and French language First Level data is available for some patent issuing authorities (original titles, abstracts and claims).


Q: What will I retrieve and display if I use the new abstract AB index/display field?

A: The following data is indexed under /AB:
Thomson Scientific value-add abstracts available for the Basic and equivalents of the patent family plus any available original author abstracts.

The /AB index is the only index which contains a mixture of Thomson Scientific value-add and original First Level data.

However, the AB display qualifier will only display the Thomson Scientific value-add abstract which is available for the Basic of the patent family. This means that although equivalent and original abstracts are indexed under /AB they can NOT be displayed using AB.

The Thomson Scientific value-add abstract for the Basic and equivalents of the patent family can be displayed using the ABS display qualifier.

Thomson Scientific value-add Documentation and Extension Abstracts and the Technology Focus field are NOT indexed under /AB.


Q: I seem to retrieve more documents in the new file with identical queries. Why is that?

A: There are several possible reasons why more documents are retrieved, for example:

  1. Documentation Abstracts (ABDT) and Extension Abstracts (ABEX) have been added to the standard Basic Index in both WPINDEX and WPIDS.
  2. There is more application information in the file.
  3. Additional DCR references will yield more bibliographic documents after a structure search.


Q: I run scripts depending on update/entry dates. Will I encounter problems resulting from the changeover?

A: While it has not been possible to preserve the entire history of updates in the new file, the updates in the old and reloaded files have been in sync since update 200610. Scripts reaching back no further than 200610 should therefore not be affected. You may notice, however, that the volumes of particular updates are not identical, since the numbers of documents may vary according to the different quality assessment processes applied at Thomson Scientific. For less frequently executed scripts it is recommended that you execute a final run shortly before the changeover.
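
A script that depends on update codes could guard against reaching back too far with a check like the following Python sketch. The function name is_in_sync and the assumption that update codes are always six-digit strings such as 200610 are illustrative only.

```python
SYNC_START = "200610"  # first update for which old and reloaded files are in sync

def is_in_sync(update_code: str) -> bool:
    """Return True if a six-digit update code falls at or after the sync point."""
    if len(update_code) != 6 or not update_code.isdigit():
        raise ValueError(f"expected a six-digit update code, got {update_code!r}")
    # Equal-length digit strings compare in the same order as the numbers they hold.
    return update_code >= SYNC_START

print(is_in_sync("200609"))  # False: predates the sync point
print(is_in_sync("200610"))  # True
```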


Q: I need to see when a particular patent publication entered the DWPI database. Can I still use the UPP display for this purpose?

A: Yes. In contrast to other update dates, STN has assigned the historical update dates for patent publications to the individual publications in the reloaded database.


Q: What does it mean when you state there are no “stopwords” any longer?

A: Improved searching technology now allows the indexing of all words whereas previously some very prevalent ones such as “from”, “is”, “which” etc. (stopwords) could not be indexed. Since all words are now indexed, precision and recall will be improved and some terms have been made searchable which were not accessible at all previously.
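
The effect of dropping the stopword list can be sketched with a toy inverted index in Python. The sample documents and the build_index helper are assumptions for illustration; only the three stopwords come from the answer above.

```python
from collections import defaultdict

STOPWORDS = {"from", "is", "which"}  # the examples named in the answer above

def build_index(docs, stopwords=frozenset()):
    """Map each word to the set of document ids containing it, skipping stopwords."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            if word not in stopwords:
                index[word].add(doc_id)
    return index

docs = {1: "gas released from vessel", 2: "vessel which is sealed"}
old_index = build_index(docs, STOPWORDS)  # previous behaviour: stopwords skipped
new_index = build_index(docs)             # new behaviour: every word indexed

print(sorted(old_index.get("from", set())))  # []
print(sorted(new_index.get("from", set())))  # [1]
```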


Q: I see that standard patent assignee codes (PACOs) now have a suffix ‘-C’ attached. Why is this?

A: Standard PACOs now have a type ‘C’ assigned to provide a uniform patent assignee code format. The ‘C’ stands for ‘company’. The standard codes remain searchable without the suffix.
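
If you post-process downloaded records, codes stored with and without the suffix can be brought to one form with a small helper such as the Python sketch below. The code ABCD is made up, the function name normalize_paco is hypothetical, and leaving any code that already carries a hyphenated suffix untouched is an assumption.

```python
def normalize_paco(code: str) -> str:
    """Return a standard assignee code in the uniform 'XXXX-C' form."""
    code = code.strip().upper()
    if "-" in code:
        # Assumption: a code that already has a suffix is left as it is.
        return code
    return code + "-C"

print(normalize_paco("abcd"))    # ABCD-C
print(normalize_paco("ABCD-C"))  # ABCD-C
```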


Q: Application data can no longer be linked to PD (publication date), PY (publication year) and ICM (main IPC1-7) with paragraph (P) proximity. How can I accomplish this in the new file?

A: While PN (patent number) and PC (patent country) can be linked with paragraph (P) proximity to the application data, for PD, PY and PK a more elaborate search statement needs to be employed within the new Member Patent Level: use link (L) proximity and employ the publication document level qualifier for this purpose (Publication/DLVL).


Q: Using the (T) operator to link AP (application number) and AC (application country) with AD (application date) and AY (application year) is no longer available. How can that be accomplished in the new file?

A: Use of paragraph (P) proximity is recommended for the same purpose.


Q: The DWPI format for some patent numbers seems to have changed. Where can I obtain a specification document?

A: DWPI standard patent numbers have changed in a few cases. For example, US applications are now 11 digits in length and PCT applications are now 10 digits in length, with a four-digit year irrespective of the publication date. A description is given in the fully revised DWPI on STN online user guide.


Q: In the index the pre-Y2K application numbers now seem to have a four-digit year portion. Why has this been changed?

A: Application and priority numbers are now indexed with a four-digit year throughout for a more uniform appearance and to overcome sorting problems in the index. The same applies to AN (accession number), CR (cross-reference accession number), DNC (CPI secondary accession number) and DNN (non-CPI secondary accession number). Old formats are searchable via a search field edit.
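
The sorting problem mentioned above can be shown with a short Python sketch. The ‘YY-serial’ layout used here is a simplified, hypothetical format and not the actual DWPI number layout, and the pivot of 50 for choosing the century is likewise an assumption.

```python
def expand_year(number: str, pivot: int = 50) -> str:
    """Expand the leading two-digit year of a hypothetical 'YY-serial' string."""
    year, serial = number.split("-", 1)
    if not year.isdigit() or len(year) not in (2, 4):
        raise ValueError(f"unexpected year portion in {number!r}")
    if len(year) == 4:
        return number  # already carries a four-digit year
    # Assumption: two-digit years at or above the pivot belong to the 1900s.
    century = 1900 if int(year) >= pivot else 2000
    return f"{century + int(year)}-{serial}"

old = ["01-000123", "99-000456"]
print(sorted(old))                          # ['01-000123', '99-000456']
print(sorted(expand_year(n) for n in old))  # ['1999-000456', '2001-000123']
```

With two-digit years the 2001 number sorts ahead of the 1999 number; with four-digit years the order is chronological.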


Q: Cross-reference DWPI accession numbers do not carry the associated update (week) any longer. Why is that?

A: It is no longer possible to populate this data due to changes in the Thomson Scientific production process.


Q: The ‘structured’ DCR numbers in the IT display field have shed their structure. Why is that, and what are the implications for crossover between DCR and DWPI?

A: Structured DCRs are no longer supplied for the bibliographic part of DWPI, although they are still available in DCR. In the IT display they have been replaced by a unique and unambiguous structure identifier, which appears in the DCR segment in the AN.S display field; the ‘DCR-’ prefix has been omitted for simplicity. The crossover employing the /DCR feature is not affected by this change.


Q: In the chemical coding paragraphs there are MCN: entries listed. What are they?

A: These Markush compound numbers have been taken out of the realm of DCNs (Specific Compound Numbers) and are now labelled MCNs. They are also now searchable in a separate search field in order to avoid confusion.


Q: Where can I obtain details of the XML document structure for use with the XML download format?

A: Thomson Scientific will provide DTDs corresponding to distributed XML documents. Please contact your nearest Thomson Scientific office for further details.


Q: How are German umlauts and French accents represented in the new DWPI?

A: For online representation and indexing purposes, ASCII transliterations have been provided by Thomson Scientific. Where they have not been provided, national characters have been transliterated according to the rules generally applying to FIZ Karlsruhe files. However, due to variations in the source data, it is recommended that you expand the relevant index before searching in order to include all potential options.
For many sections, both marked-up and ASCII versions of the data have been provided to FIZ Karlsruhe by Thomson Scientific. The marked-up sections contain national characters and symbols in Unicode, encoded in UTF-8. This data will be made available unaltered through an XML download option, so for post-processing purposes these characters are at your disposal in Unicode form if required.
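
When post-processing the UTF-8 data, you may want to derive the ASCII spellings under which a term could have been indexed. The Python sketch below produces two common variants for each term. The mapping table (ae, oe, ue, ss) and the function names are assumptions based on widespread convention; they are not the documented FIZ Karlsruhe transliteration rules.

```python
import unicodedata

# Assumed digraph convention for German characters (not an official rule set).
GERMAN = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"}

def strip_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition (é -> e, ü -> u)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def ascii_variants(term: str) -> list:
    """Return the sorted ASCII spellings to try: digraph form and base-letter form."""
    term = unicodedata.normalize("NFC", term)  # make sure umlauts are single characters
    digraph = strip_accents("".join(GERMAN.get(c, c) for c in term))
    base = strip_accents(term.replace("ß", "ss"))
    return sorted({digraph, base})

term = "Kühlgerät".encode("utf-8").decode("utf-8")  # as read from a UTF-8 download
print(ascii_variants(term))      # ['Kuehlgeraet', 'Kuhlgerat']
print(ascii_variants("séparé"))  # ['separe']
```

The resulting variants are candidates to look for when expanding the relevant index, as recommended above.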