We need a document that explains why particular results are returned for a search and how to perform certain types of searches. This page is only a partial list of the relevant points.
Relevancy
Matches more, but best hits at top
- documents that contain the search query as a phrase (same words in the same order) will be at the top (highest relevancy)
- more matching terms at top - if all terms match, that is more relevant than all but one term matching, which is more relevant than all but two terms matching ...
- exact matches (w/o stemming) before stemmed matches
- mm -- "Minimum Must Match"
- if 4 terms or fewer, all must match; if 5 or more terms, 90% (rounded down) must match. So if there were 6 terms, 5 of them must match.
- for advanced search, all terms must match.
- "terms" does NOT include "stopwords" such as a, an, the, ... A phrase in a search (something surrounded by quotes) is a single term.
- Around the donut - 2 terms
- "Around the donut" - 1 term
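The term-counting and "Minimum Must Match" rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Solr configuration itself: the function names and the short stopword set are assumptions made for the example, and the real stopword list lives in the Solr setup.

```python
import re

# Assumed stopword list, for illustration only; the real list is configured in Solr.
STOPWORDS = {"a", "an", "the"}


def count_terms(query: str) -> int:
    """Count query terms: a quoted phrase is one term, stopwords are not counted."""
    # Each token is either a whole quoted phrase or a run of non-space characters.
    tokens = re.findall(r'"[^"]*"|\S+', query)
    return sum(
        1 for t in tokens
        if t.startswith('"') or t.lower() not in STOPWORDS
    )


def min_must_match(num_terms: int) -> int:
    """4 terms or fewer: all must match; 5 or more: 90%, rounded down."""
    if num_terms <= 4:
        return num_terms
    # Integer arithmetic avoids floating-point surprises when rounding down.
    return (num_terms * 9) // 10


print(count_terms('Around the donut'))    # 2 terms
print(count_terms('"Around the donut"'))  # 1 term
print(min_must_match(6))                  # 5 of 6 must match
```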
"Slop" - the distance allowed between consecutive query terms
- query slop (qs) - affects whether or not the document is in the search results
- For a phrase in query (surrounded by quotes in query), this is the distance that can separate the query terms.
- our setting is 1.
- query: "french beans food scares" (with quotes) would match document containing "french beans make food scares" but would not match "french beans can make food scares"
- qs applies only when there is a phrase (in quotes) in the query
- phrase slop (ps) - affects how high the document is in a set of search results.
- like query slop, but it only affects the relevancy sorting of the matching documents.
- our setting is 0
- ps applies to ALL result sets
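To make the slop distance concrete, here is a simplified Python sketch of the check. It is an assumption-laden approximation: it only considers query words appearing in their original order and counts the extra words between them, whereas Lucene's real slop calculation also handles reordered words. The function name `phrase_matches` is hypothetical.

```python
def phrase_matches(phrase: str, text: str, slop: int) -> bool:
    """Return True if the phrase words occur in order in the text with at
    most `slop` extra words in total between them (simplified, in-order only)."""
    words = phrase.lower().split()
    doc = text.lower().split()
    if not words:
        return False

    def search(word_idx: int, prev_pos: int, budget: int) -> bool:
        # All phrase words have been placed: the phrase matches.
        if word_idx == len(words):
            return True
        # The next word may sit up to `budget` positions past the adjacent slot.
        for gap in range(budget + 1):
            pos = prev_pos + 1 + gap
            if pos < len(doc) and doc[pos] == words[word_idx]:
                if search(word_idx + 1, pos, budget - gap):
                    return True
        return False

    return any(
        doc[start] == words[0] and search(1, start, slop)
        for start in range(len(doc))
    )


query = "french beans food scares"
print(phrase_matches(query, "french beans make food scares", slop=1))      # True
print(phrase_matches(query, "french beans can make food scares", slop=1))  # False
```

With a slop of 0, the same function only accepts the words as an exact, uninterrupted phrase.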
Stemming
When data is indexed or searched in Solr, a stemming algorithm reduces the different forms of a word to its root. The advantage is that more matches are retrieved.
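As a rough illustration of why stemming retrieves more matches, the toy Python sketch below strips a few common English suffixes before comparing words. It is not the stemmer Solr uses: the suffix list and the function names are assumptions chosen to keep the example short, and a real stemmer handles far more cases.

```python
# Toy suffix list, an assumption for illustration; Solr's stemmer is far more thorough.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Reduce a word to a crude root by removing one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stemmed_match(query_word: str, doc_word: str) -> bool:
    """Two words match if they share the same crude root."""
    return toy_stem(query_word) == toy_stem(doc_word)


print(toy_stem("searches"))                         # search
print(toy_stem("searching"))                        # search
print(stemmed_match("photographs", "photograph"))   # True
```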
Punctuation
Hyphens
Hyphens with no spaces around them work properly -- the following are equivalent:
- color-blind
- "color blind"
...both of these also match colorblind, but the relevancy will favor "color blind". The query colorblind gets fewer results than the above, because it only matches the single-word variety.
Hyphens preceded by a space are treated as NOT:
- color -blind --> color NOT blind
- color - blind --> color NOT blind
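The hyphen rules above can be expressed as a small query rewrite. The Python sketch below only mirrors the behavior described in this section; `normalize_hyphens` is a hypothetical helper, not the code SearchWorks actually runs.

```python
import re


def normalize_hyphens(query: str) -> str:
    """Rewrite hyphens in a query according to the rules described above."""
    # A hyphen preceded by a space (with or without a space after it) means NOT.
    query = re.sub(r'\s+-\s*(?=\S)', ' NOT ', query)
    # A hyphen with no spaces around it joins the words into a quoted phrase.
    query = re.sub(
        r'\w+(?:-\w+)+',
        lambda m: '"' + m.group(0).replace('-', ' ') + '"',
        query,
    )
    return query


print(normalize_hyphens("color-blind"))    # "color blind"
print(normalize_hyphens("color -blind"))   # color NOT blind
print(normalize_hyphens("color - blind"))  # color NOT blind
```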
Colons now work properly in searches -- the following are equivalent:
- Jazz : photographs
- Jazz: photographs
- Jazz photographs
Ampersands are treated as lowercase "and" -- the following are equivalent:
- dogs & cats
- dogs and cats
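A minimal Python sketch of the colon and ampersand equivalences listed above, assuming a simple whitespace tokenizer. `normalize_punctuation` is a hypothetical name, and the real analysis happens inside Solr rather than in code like this.

```python
def normalize_punctuation(query: str) -> str:
    """Drop colons and treat a standalone ampersand as lowercase "and"."""
    # Replacing colons with spaces makes "Jazz:" and "Jazz :" tokenize the same way.
    tokens = query.replace(":", " ").split()
    return " ".join("and" if token == "&" else token for token in tokens)


for q in ("Jazz : photographs", "Jazz: photographs", "Jazz photographs"):
    print(normalize_punctuation(q))  # Jazz photographs (all three)

print(normalize_punctuation("dogs & cats"))  # dogs and cats
```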
Special characters
Characters that perform specified functions in the catalog.
Curly braces - Curly braces were used to specify exactly which indexed MARC tag or special field tag (catkey or URL) to search. While this function still works in the staff WorkFlows interface, the relevancy ranking in SearchWorks removes the user's need to search specific MARC tags. A common use for the curly brace was getting a specific record by its catkey (ckey).
- In Socrates you would search: 8571956{ckey}
- In SearchWorks, you can either:
- type the ckey into the search box and your result set will include the document with that ckey
- type the ckey into the browser's address bar: searchworks/view/8571956 or searchworks.stanford.edu/view/8571956
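For anyone scripting record lookups, the address-bar form above can be built from a ckey as in the Python sketch below. The `record_url` helper is hypothetical, and the `https` scheme and the assumption that a ckey is all digits are guesses based on the example, not documented facts.

```python
SOCRATES_SUFFIX = "{ckey}"  # the old Socrates field tag, e.g. 8571956{ckey}


def record_url(ckey: str, host: str = "searchworks.stanford.edu") -> str:
    """Build the record view address for a ckey (also accepts the Socrates form)."""
    ckey = ckey.strip()
    if ckey.endswith(SOCRATES_SUFFIX):
        ckey = ckey[: -len(SOCRATES_SUFFIX)]
    # Assumption: ckeys are purely numeric, as in the example above.
    if not ckey.isdigit():
        raise ValueError(f"not a numeric ckey: {ckey!r}")
    # Assumption: the site is served over https.
    return f"https://{host}/view/{ckey}"


print(record_url("8571956"))
print(record_url("8571956{ckey}"))
```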
Truncation
- SearchWorks uses stemming
- Advanced Search: uses the Lucene request handler
Wildcards
- SearchWorks uses stemming
- Advanced Search: uses the Lucene request handler
Boolean Simple Search
Hyphens preceded by a space are treated as NOT:
- -bad --> NOT bad
- AND
- select "All" (insert image here)
- OR
- select "Any" (insert image)
ISBN, ISSN searches
Just type them into the search box.
Periodicals, newspapers, and journals ... huh?
Try using the format facet together with a title search.