Note
You are viewing the documentation for an older version of boto (boto2).
Boto3, the next version of Boto, is now stable and recommended for general use. It can be used side-by-side with Boto in the same project, so it is easy to start using Boto3 in your existing projects as well as new projects. Going forward, API updates and all new feature work will be focused on Boto3.
For more information, see the documentation for boto3.
Cloudsearch¶
boto.cloudsearch.domain¶
-
class
boto.cloudsearch.domain.
Domain
(layer1, data)¶ A Cloudsearch domain.
Variables: - name – The name of the domain.
- id – The internally generated unique identifier for the domain.
- created – A boolean which is True if the domain is created. It can take several minutes to initialize a domain when CreateDomain is called. Newly created search domains are returned with a False value for Created until domain creation is complete.
- deleted – A boolean which is True if the search domain has been deleted. The system must clean up resources dedicated to the search domain when delete is called. Newly deleted search domains are returned from list_domains with a True value for deleted for several minutes until resource cleanup is complete.
- processing – True if processing is being done to activate the current domain configuration.
- num_searchable_docs – The number of documents that have been submitted to the domain and indexed.
- requires_index_documents – True if index_documents needs to be called to activate the current domain configuration.
- search_instance_count – The number of search instances that are available to process search requests.
- search_instance_type – The instance type that is being used to process search requests.
- search_partition_count – The number of partitions across which the search index is spread.
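Because a new domain reports created as False for several minutes, callers usually poll before using it. The following is a minimal sketch of such a loop, not part of boto itself. It assumes `conn` behaves like `boto.cloudsearch.layer2.Layer2` (it only needs a `lookup(domain_name)` method); the function name, poll interval, and retry limit are hypothetical choices.

```python
import time


def wait_until_created(conn, domain_name, poll_seconds=30, max_polls=40,
                       sleep=time.sleep):
    """Poll until the named domain reports created == True.

    `conn` is assumed to expose lookup(domain_name), returning a Domain-like
    object or None, as Layer2 is documented to do.
    """
    for _ in range(max_polls):
        domain = conn.lookup(domain_name)
        if domain is not None and domain.created:
            return domain
        sleep(poll_seconds)  # domain still initializing; try again later
    raise RuntimeError("domain %r was not created in time" % domain_name)
```

Passing `sleep` as a parameter keeps the helper easy to exercise in tests without real delays.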
-
create_index_field
(field_name, field_type, default='', facet=False, result=False, searchable=False, source_attributes=[])¶ Defines an IndexField, either replacing an existing definition or creating a new one.
Parameters: - field_name (string) – The name of a field in the search index.
- field_type (string) – The type of field. Valid values are uint | literal | text
- default (string or int) – The default value for the field. If the field is of type uint this should be an integer value. Otherwise, it’s a string.
- facet (bool) – A boolean to indicate whether facets are enabled for this field or not. Does not apply to fields of type uint.
- result (bool) – A boolean to indicate whether values of this field can be returned in search results or used in ranking. Does not apply to fields of type uint.
- searchable (bool) – A boolean to indicate whether search is enabled for this field or not. Applies only to fields of type literal.
- source_attributes (list of dicts) –
An optional list of dicts that provide information about attributes for this index field. A maximum of 20 source attributes can be configured for each index field.
Each item in the list is a dict with the following keys:
- data_copy - The value is a dict with the following keys:
  - default - Optional default value if the source attribute is not specified in a document.
  - name - The name of the document source field to add to this IndexField.
- data_function - Identifies the transformation to apply when copying data from a source attribute.
- data_map - The value is a dict with the following keys:
  - cases - A dict that translates source field values to custom values.
  - default - An optional default value to use if the source attribute is not specified in a document.
  - name - The name of the document source field to add to this IndexField.
- data_trim_title - Trims common title words from a source document attribute when populating an IndexField. This can be used to create an IndexField you can use for sorting. The value is a dict with the following fields:
  - default - An optional default value.
  - language - An IETF RFC 4646 language code.
  - separator - The separator that follows the text to trim.
  - name - The name of the document source field to add.
Raises: BaseException, InternalException, LimitExceededException, InvalidTypeException, ResourceNotFoundException
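The nested source_attributes structure is easier to see in code. The sketch below builds a list of data_copy entries and passes it to create_index_field. The helper names are hypothetical, `domain` is assumed to be a Domain-like object, and the 'Copy' value for data_function is an assumption based on the CloudSearch 2011-02-01 API rather than something stated in this reference.

```python
MAX_SOURCE_ATTRIBUTES = 20  # limit stated in the parameter description


def copy_source(source_name, default=None):
    """Build one source-attribute dict that copies a document field."""
    data_copy = {'name': source_name}
    if default is not None:
        data_copy['default'] = default
    # ASSUMPTION: 'Copy' is the data_function value for a plain copy.
    return {'data_function': 'Copy', 'data_copy': data_copy}


def define_text_field(domain, field_name, source_names):
    """Define a returnable text field fed from several document fields."""
    sources = [copy_source(name) for name in source_names]
    if len(sources) > MAX_SOURCE_ATTRIBUTES:
        raise ValueError("at most %d source attributes are allowed"
                         % MAX_SOURCE_ATTRIBUTES)
    return domain.create_index_field(field_name, 'text', result=True,
                                     source_attributes=sources)
```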
-
create_rank_expression
(name, expression)¶ Create a new rank expression.
Parameters: - name (string) – The name of an expression computed for ranking while processing a search request.
- expression (string) –
The expression to evaluate for ranking or thresholding while processing a search request. The RankExpression syntax is based on JavaScript expressions and supports:
- Integer, floating point, hex and octal literals
- Shortcut evaluation of logical operators such that an expression a || b evaluates to the value a if a is true without evaluating b at all
- JavaScript order of precedence for operators
- Arithmetic operators: + - * / %
- Boolean operators (including the ternary operator)
- Bitwise operators
- Comparison operators
- Common mathematical functions: abs ceil erf exp floor lgamma ln log2 log10 max min sqrt pow
- Trigonometric library functions: acosh acos asinh asin atanh atan cosh cos sinh sin tanh tan
- Random generation of a number between 0 and 1: rand
- Current time in epoch: time
- The min max functions that operate on a variable argument list
Intermediate results are calculated as double precision floating point values. The final return value of a RankExpression is automatically converted from floating point to a 32-bit unsigned integer by rounding to the nearest integer, with a natural floor of 0 and a ceiling of max(uint32_t), 4294967295. Mathematical errors such as dividing by 0 will fail during evaluation and return a value of 0.
The source data for a RankExpression can be the name of an IndexField of type uint, another RankExpression or the reserved name text_relevance. The text_relevance source is defined to return an integer from 0 to 1000 (inclusive) to indicate how relevant a document is to the search request, taking into account repetition of search terms in the document and proximity of search terms to each other in each matching IndexField in the document.
For more information about using rank expressions to customize ranking, see the Amazon CloudSearch Developer Guide.
Raises: BaseException, InternalException, LimitExceededException, InvalidTypeException, ResourceNotFoundException
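The conversion of a rank expression's final value can be mimicked locally when reasoning about what a given expression will produce. This is a sketch of the rule described above (round to nearest, floor of 0, ceiling of 4294967295, errors yield 0), not the service's implementation. The function name is hypothetical, and rounding halves upward is an assumption, since the text does not say how ties are handled.

```python
import math

UINT32_MAX = 4294967295


def to_rank_value(x):
    """Approximate how a RankExpression result becomes a uint32."""
    try:
        x = float(x)
    except (TypeError, ValueError):
        return 0  # treat unusable input like a failed evaluation
    if math.isnan(x) or x <= 0:
        return 0  # natural floor of 0
    if x >= UINT32_MAX:
        return UINT32_MAX  # ceiling of max(uint32_t)
    # ASSUMPTION: ties round upward; the reference only says "nearest".
    return int(math.floor(x + 0.5))
```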
-
created
¶
-
delete
()¶ Delete this domain and all index data associated with it.
-
deleted
¶
-
doc_service_arn
¶
-
doc_service_endpoint
¶
-
get_access_policies
()¶ Return a
boto.cloudsearch.option.OptionStatus
object representing the currently defined access policies for the domain.
-
get_document_service
()¶
-
get_index_fields
(field_names=None)¶ Return a list of index fields defined for this domain.
-
get_rank_expressions
(rank_names=None)¶ Return a list of rank expressions defined for this domain.
-
get_search_service
()¶
-
get_stemming
()¶ Return a
boto.cloudsearch.option.OptionStatus
object representing the currently defined stemming options for the domain.
-
get_stopwords
()¶ Return a
boto.cloudsearch.option.OptionStatus
object representing the currently defined stopword options for the domain.
-
get_synonyms
()¶ Return a
boto.cloudsearch.option.OptionStatus
object representing the currently defined synonym options for the domain.
-
id
¶
-
index_documents
()¶ Tells the search domain to start indexing its documents using the latest text processing options and IndexFields. This operation must be invoked to make options whose OptionStatus has an OptionState of RequiresIndexDocuments visible in search results.
-
name
¶
-
num_searchable_docs
¶
-
processing
¶
-
requires_index_documents
¶
-
search_instance_count
¶
-
search_partition_count
¶
-
search_service_arn
¶
-
search_service_endpoint
¶
-
update_from_data
(data)¶
-
boto.cloudsearch.domain.
handle_bool
(value)¶
boto.cloudsearch.exceptions¶
boto.cloudsearch.layer1¶
-
class
boto.cloudsearch.layer1.
Layer1
(aws_access_key_id=None, aws_secret_access_key=None, is_secure=True, host=None, port=None, proxy=None, proxy_port=None, proxy_user=None, proxy_pass=None, debug=0, https_connection_factory=None, region=None, path='/', api_version=None, security_token=None, validate_certs=True, profile_name=None)¶ -
APIVersion
= '2011-02-01'¶
-
DefaultRegionEndpoint
= 'cloudsearch.us-east-1.amazonaws.com'¶
-
DefaultRegionName
= 'us-east-1'¶
-
create_domain
(domain_name)¶ Create a new search domain.
Parameters: domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed. Raises: BaseException, InternalException, LimitExceededException
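The naming rule for domains can be checked before calling the service. The sketch below encodes only the character rules stated above; the function name is hypothetical, and the service may enforce additional constraints (such as length limits) that this reference does not list.

```python
import re

# Starts with a lowercase letter or digit, then lowercase letters,
# digits, or hyphens. \Z avoids accepting a trailing newline.
_DOMAIN_NAME_RE = re.compile(r'^[a-z0-9][a-z0-9-]*\Z')


def is_valid_domain_name(name):
    """Return True if `name` satisfies the documented character rules."""
    return bool(_DOMAIN_NAME_RE.match(name))
```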
-
define_index_field
(domain_name, field_name, field_type, default='', facet=False, result=False, searchable=False, source_attributes=None)¶ Defines an IndexField, either replacing an existing definition or creating a new one.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- field_name (string) – The name of a field in the search index.
- field_type (string) – The type of field. Valid values are uint | literal | text
- default (string or int) – The default value for the field. If the field is of type uint this should be an integer value. Otherwise, it’s a string.
- facet (bool) – A boolean to indicate whether facets are enabled for this field or not. Does not apply to fields of type uint.
- result (bool) – A boolean to indicate whether values of this field can be returned in search results or used in ranking. Does not apply to fields of type uint.
- searchable (bool) – A boolean to indicate whether search is enabled for this field or not. Applies only to fields of type literal.
- source_attributes (list of dicts) –
An optional list of dicts that provide information about attributes for this index field. A maximum of 20 source attributes can be configured for each index field.
Each item in the list is a dict with the following keys:
- data_copy - The value is a dict with the following keys:
  - default - Optional default value if the source attribute is not specified in a document.
  - name - The name of the document source field to add to this IndexField.
- data_function - Identifies the transformation to apply when copying data from a source attribute.
- data_map - The value is a dict with the following keys:
  - cases - A dict that translates source field values to custom values.
  - default - An optional default value to use if the source attribute is not specified in a document.
  - name - The name of the document source field to add to this IndexField.
- data_trim_title - Trims common title words from a source document attribute when populating an IndexField. This can be used to create an IndexField you can use for sorting. The value is a dict with the following fields:
  - default - An optional default value.
  - language - An IETF RFC 4646 language code.
  - separator - The separator that follows the text to trim.
  - name - The name of the document source field to add.
Raises: BaseException, InternalException, LimitExceededException, InvalidTypeException, ResourceNotFoundException
-
define_rank_expression
(domain_name, rank_name, rank_expression)¶ Defines a RankExpression, either replacing an existing definition or creating a new one.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- rank_name (string) – The name of an expression computed for ranking while processing a search request.
- rank_expression (string) –
The expression to evaluate for ranking or thresholding while processing a search request. The RankExpression syntax is based on JavaScript expressions and supports:
- Integer, floating point, hex and octal literals
- Shortcut evaluation of logical operators such that an expression a || b evaluates to the value a if a is true without evaluating b at all
- JavaScript order of precedence for operators
- Arithmetic operators: + - * / %
- Boolean operators (including the ternary operator)
- Bitwise operators
- Comparison operators
- Common mathematical functions: abs ceil erf exp floor lgamma ln log2 log10 max min sqrt pow
- Trigonometric library functions: acosh acos asinh asin atanh atan cosh cos sinh sin tanh tan
- Random generation of a number between 0 and 1: rand
- Current time in epoch: time
- The min max functions that operate on a variable argument list
Intermediate results are calculated as double precision floating point values. The final return value of a RankExpression is automatically converted from floating point to a 32-bit unsigned integer by rounding to the nearest integer, with a natural floor of 0 and a ceiling of max(uint32_t), 4294967295. Mathematical errors such as dividing by 0 will fail during evaluation and return a value of 0.
The source data for a RankExpression can be the name of an IndexField of type uint, another RankExpression or the reserved name text_relevance. The text_relevance source is defined to return an integer from 0 to 1000 (inclusive) to indicate how relevant a document is to the search request, taking into account repetition of search terms in the document and proximity of search terms to each other in each matching IndexField in the document.
For more information about using rank expressions to customize ranking, see the Amazon CloudSearch Developer Guide.
Raises: BaseException, InternalException, LimitExceededException, InvalidTypeException, ResourceNotFoundException
-
delete_domain
(domain_name)¶ Delete a search domain.
Parameters: domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed. Raises: BaseException, InternalException
-
delete_index_field
(domain_name, field_name)¶ Deletes an existing IndexField from the search domain.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- field_name (string) – A string that represents the name of an index field. Field names must begin with a letter and can contain the following characters: a-z (lowercase), 0-9, and _ (underscore). Uppercase letters and hyphens are not allowed. The names “body”, “docid”, and “text_relevance” are reserved and cannot be specified as field or rank expression names.
Raises: BaseException, InternalException, ResourceNotFoundException
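Field names follow different rules from domain names (underscores instead of hyphens, a leading letter, and three reserved names). A small local check, written as a sketch with a hypothetical function name, might look like this:

```python
import re

# Names the reference says cannot be used for fields or rank expressions.
RESERVED_NAMES = frozenset(['body', 'docid', 'text_relevance'])

# Begins with a lowercase letter, then lowercase letters, digits, or underscores.
_FIELD_NAME_RE = re.compile(r'^[a-z][a-z0-9_]*\Z')


def is_valid_field_name(name):
    """Return True if `name` satisfies the documented field-name rules."""
    return name not in RESERVED_NAMES and bool(_FIELD_NAME_RE.match(name))
```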
-
delete_rank_expression
(domain_name, rank_name)¶ Deletes an existing RankExpression from the search domain.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- rank_name (string) – Name of the RankExpression to delete.
Raises: BaseException, InternalException, ResourceNotFoundException
-
describe_default_search_field
(domain_name)¶ Describes options defining the default search field used by indexing for the search domain.
Parameters: domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed. Raises: BaseException, InternalException, ResourceNotFoundException
-
describe_domains
(domain_names=None)¶ Describes the domains (optionally limited to one or more domains by name) owned by this account.
Parameters: domain_names (list) – Limits the response to the specified domains. Raises: BaseException, InternalException
-
describe_index_fields
(domain_name, field_names=None)¶ Describes index fields in the search domain, optionally limited to a single IndexField.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- field_names (list) – Limits the response to the specified fields.
Raises: BaseException, InternalException, ResourceNotFoundException
-
describe_rank_expressions
(domain_name, rank_names=None)¶ Describes RankExpressions in the search domain, optionally limited to a single expression.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- rank_names (list) – Limit response to the specified rank names.
Raises: BaseException, InternalException, ResourceNotFoundException
-
describe_service_access_policies
(domain_name)¶ Describes the resource-based policies controlling access to the services in this search domain.
Parameters: domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed. Raises: BaseException, InternalException, ResourceNotFoundException
-
describe_stemming_options
(domain_name)¶ Describes stemming options used by indexing for the search domain.
Parameters: domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed. Raises: BaseException, InternalException, ResourceNotFoundException
-
describe_stopword_options
(domain_name)¶ Describes stopword options used by indexing for the search domain.
Parameters: domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed. Raises: BaseException, InternalException, ResourceNotFoundException
-
describe_synonym_options
(domain_name)¶ Describes synonym options used by indexing for the search domain.
Parameters: domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed. Raises: BaseException, InternalException, ResourceNotFoundException
-
get_response
(doc_path, action, params, path='/', parent=None, verb='GET', list_marker=None)¶
-
index_documents
(domain_name)¶ Tells the search domain to start scanning its documents using the latest text processing options and IndexFields. This operation must be invoked to make visible in searches any options whose OptionStatus has an OptionState of RequiresIndexDocuments.
Parameters: domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed. Raises: BaseException, InternalException, ResourceNotFoundException
-
update_default_search_field
(domain_name, default_search_field)¶ Updates options defining the default search field used by indexing for the search domain.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- default_search_field (string) – The IndexField to use for search requests issued with the q parameter. The default is an empty string, which automatically searches all text fields.
Raises: BaseException, InternalException, InvalidTypeException, ResourceNotFoundException
-
update_service_access_policies
(domain_name, access_policies)¶ Updates the policies controlling access to the services in this search domain.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- access_policies (string) – An IAM access policy as described in The Access Policy Language in Using AWS Identity and Access Management. The maximum size of an access policy document is 100KB.
Raises: BaseException, InternalException, LimitExceededException, ResourceNotFoundException, InvalidTypeException
-
update_stemming_options
(domain_name, stems)¶ Updates stemming options used by indexing for the search domain.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- stems (string) – Maps terms to their stems. The JSON object has a single key called "stems" whose value is a dict mapping terms to their stems. The maximum size of a stemming document is 500KB. Example: {"stems": {"people": "person", "walking": "walk"}}
Raises: BaseException, InternalException, InvalidTypeException, LimitExceededException, ResourceNotFoundException
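Since stems is passed as a JSON string, it is convenient to build it from a Python dict and check its size first. The sketch below does that with the standard library only. The function name is hypothetical, and treating 500KB as 500 * 1024 bytes is an assumption.

```python
import json

# ASSUMPTION: "500KB" is interpreted as 500 * 1024 bytes.
MAX_STEMS_BYTES = 500 * 1024


def build_stems_document(stems):
    """Serialize a {term: stem} dict into the JSON string the API expects."""
    doc = json.dumps({'stems': stems}, sort_keys=True)
    if len(doc.encode('utf-8')) > MAX_STEMS_BYTES:
        raise ValueError("stemming document exceeds the documented size limit")
    return doc
```

The resulting string could then be passed as the stems argument of update_stemming_options.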
-
update_stopword_options
(domain_name, stopwords)¶ Updates stopword options used by indexing for the search domain.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- stopwords (string) – Lists stopwords in a JSON object. The object has a single key called "stopwords" whose value is an array of strings. The maximum size of a stopwords document is 10KB. Example: {"stopwords": ["a", "an", "the", "of"]}
Raises: BaseException, InternalException, InvalidTypeException, LimitExceededException, ResourceNotFoundException
-
update_synonym_options
(domain_name, synonyms)¶ Updates synonym options used by indexing for the search domain.
Parameters: - domain_name (string) – A string that represents the name of a domain. Domain names must be unique across the domains owned by an account within an AWS region. Domain names must start with a letter or number and can contain the following characters: a-z (lowercase), 0-9, and - (hyphen). Uppercase letters and underscores are not allowed.
- synonyms (string) – Maps terms to their synonyms. The JSON object has a single key "synonyms" whose value is a dict mapping terms to their synonyms. Each synonym is a simple string or an array of strings. The maximum size of a synonyms document is 100KB. Example: {"synonyms": {"cat": ["feline", "kitten"], "puppy": "dog"}}
Raises: BaseException, InternalException, InvalidTypeException, LimitExceededException, ResourceNotFoundException
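Stopword and synonym documents have the same single-key JSON shape as the stemming document. The following sketch builds both and validates the synonym values; the helper names are hypothetical and size checks are omitted for brevity.

```python
import json


def build_stopwords_document(words):
    """Serialize an iterable of words as {"stopwords": [...]}."""
    return json.dumps({'stopwords': list(words)})


def build_synonyms_document(synonyms):
    """Serialize {term: synonym-or-list-of-synonyms} as {"synonyms": {...}}."""
    for term, value in synonyms.items():
        is_string = isinstance(value, str)
        is_string_list = (isinstance(value, list)
                          and all(isinstance(v, str) for v in value))
        if not (is_string or is_string_list):
            raise TypeError("synonym for %r must be a string or list of strings"
                            % term)
    return json.dumps({'synonyms': synonyms}, sort_keys=True)
```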
-
-
boto.cloudsearch.layer1.
do_bool
(val)¶
boto.cloudsearch.layer2¶
-
class
boto.cloudsearch.layer2.
Layer2
(aws_access_key_id=None, aws_secret_access_key=None, is_secure=True, port=None, proxy=None, proxy_port=None, host=None, debug=0, session_token=None, region=None, validate_certs=True)¶ -
create_domain
(domain_name)¶ Create a new CloudSearch domain and return the corresponding
boto.cloudsearch.domain.Domain
object.
-
list_domains
(domain_names=None)¶ Return a list of
boto.cloudsearch.domain.Domain
objects for each domain defined in the current account.
-
lookup
(domain_name)¶ Look up a single domain.
Parameters: domain_name (str) – The name of the domain to look up. Returns: Domain object, or None if the domain isn’t found Return type: boto.cloudsearch.domain.Domain
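Because lookup returns None for a missing domain, it combines naturally with create_domain. A minimal sketch, assuming `conn` is a Layer2-like object (the helper name is hypothetical):

```python
def get_or_create_domain(conn, domain_name):
    """Return the existing domain, creating it only if lookup finds nothing."""
    domain = conn.lookup(domain_name)
    if domain is None:
        # A freshly created domain may take several minutes to initialize.
        domain = conn.create_domain(domain_name)
    return domain
```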
-
boto.cloudsearch.optionstatus¶
-
class
boto.cloudsearch.optionstatus.
IndexFieldStatus
(domain, data=None, refresh_fn=None, save_fn=None)¶ -
save
()¶ Write the current state of the local object back to the CloudSearch service.
-
-
class
boto.cloudsearch.optionstatus.
OptionStatus
(domain, data=None, refresh_fn=None, save_fn=None)¶ Presents a combination of status fields (defined below), which are accessed as attributes, and option values, which are stored in the native Python dictionary. In this class, the option values are merged from a JSON object that is stored as the Option part of the object.
Variables: - domain_name – The name of the domain this option is associated with.
- create_date – A timestamp for when this option was created.
- state –
The state of processing a change to an option. Possible values:
- RequiresIndexDocuments: the option’s latest value will not be visible in searches until IndexDocuments has been called and indexing is complete.
- Processing: the option’s latest value is not yet visible in all searches but is in the process of being activated.
- Active: the option’s latest value is completely visible.
- update_date – A timestamp for when this option was updated.
- update_version – A unique integer that indicates when this option was last updated.
-
endElement
(name, value, connection)¶
-
refresh
(data=None)¶ Refresh the local state of the object. You can either pass new state data in as the parameter
data
or, if that parameter is omitted, the state data will be retrieved from CloudSearch.
-
save
()¶ Write the current state of the local object back to the CloudSearch service.
-
startElement
(name, attrs, connection)¶
-
to_json
()¶ Return the JSON representation of the options as a string.
-
wait_for_state
(state)¶ Performs polling of CloudSearch to wait for the
state
of this object to change to the provided state.
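The three state values suggest a simple activation workflow: trigger indexing if the option requires it, then wait until it is Active. The sketch below expresses that with the documented state attribute, index_documents, and wait_for_state. The function name is hypothetical, and `domain` and `status` are assumed to be Domain-like and OptionStatus-like objects.

```python
def ensure_active(domain, status):
    """Bring an option to the Active state, re-indexing if required."""
    if status.state == 'RequiresIndexDocuments':
        # The latest value stays invisible until indexing is requested.
        domain.index_documents()
    if status.state != 'Active':
        # Blocks while the service finishes processing the change.
        status.wait_for_state('Active')
    return status
```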
-
class
boto.cloudsearch.optionstatus.
RankExpressionStatus
(domain, data=None, refresh_fn=None, save_fn=None)¶
-
class
boto.cloudsearch.optionstatus.
ServicePoliciesStatus
(domain, data=None, refresh_fn=None, save_fn=None)¶ -
allow_doc_ip
(ip)¶ Add the provided IP address or CIDR block to the list of allowable addresses for the document service.
Parameters: ip (string) – An IP address or CIDR block you wish to grant access to.
-
allow_search_ip
(ip)¶ Add the provided IP address or CIDR block to the list of allowable addresses for the search service.
Parameters: ip (string) – An IP address or CIDR block you wish to grant access to.
-
disallow_doc_ip
(ip)¶ Remove the provided IP address or CIDR block from the list of allowable addresses for the document service.
Parameters: ip (string) – An IP address or CIDR block you wish to revoke access from.
-
disallow_search_ip
(ip)¶ Remove the provided IP address or CIDR block from the list of allowable addresses for the search service.
Parameters: ip (string) – An IP address or CIDR block you wish to revoke access from.
-
new_statement
(arn, ip)¶ Returns a new policy statement that will allow access to the service described by arn by the IP address specified in ip.
Parameters: - arn (string) – The Amazon Resource Name (ARN) of the service you wish to provide access to. This would be either the search service or the document service.
- ip (string) – An IP address or CIDR block you wish to grant access to.
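To make the idea of a per-IP policy statement concrete, here is a sketch of what such a statement could look like in the IAM policy language. The exact dict that new_statement returns is not shown in this reference, so the shape below (including the wildcard Action) is an assumption, and the function name is hypothetical.

```python
def allow_ip_statement(arn, ip):
    """Sketch of an IAM-style statement granting `ip` access to `arn`.

    ASSUMPTION: this mirrors the general IAM policy language; it is not
    copied from boto's new_statement implementation.
    """
    return {
        'Effect': 'Allow',
        'Action': '*',
        'Resource': arn,
        'Condition': {'IpAddress': {'aws:SourceIp': [ip]}},
    }
```

In practice the allow_search_ip and allow_doc_ip methods above are the intended entry points, followed by save() to write the change back to the service.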
-
boto.cloudsearch.search¶
-
exception
boto.cloudsearch.search.
CommitMismatchError
¶
-
class
boto.cloudsearch.search.
Query
(q=None, bq=None, rank=None, return_fields=None, size=10, start=0, facet=None, facet_constraints=None, facet_sort=None, facet_top_n=None, t=None)¶ -
RESULTS_PER_PAGE
= 500¶
-
to_params
()¶ Transform search parameters from instance properties to a dictionary
Return type: dict Returns: search parameters
-
update_size
(new_size)¶
-
-
class
boto.cloudsearch.search.
SearchConnection
(domain=None, endpoint=None)¶ -
build_query
(q=None, bq=None, rank=None, return_fields=None, size=10, start=0, facet=None, facet_constraints=None, facet_sort=None, facet_top_n=None, t=None)¶
-
get_all_hits
(query)¶ Get a generator to iterate over all search results
Transparently handles the results paging from Cloudsearch search results so even if you have many thousands of results you can iterate over all results in a reasonably efficient manner.
Parameters: query (boto.cloudsearch.search.Query) – A group of search criteria
Return type: generator
Returns: All docs matching query
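The paging behaviour described for get_all_hits can be illustrated with a generic generator. This is a sketch of the general technique, not boto's implementation. `fetch_page` is a hypothetical callable that takes a start offset and page size and returns a `(docs, total_hits)` tuple; the default page size matches the RESULTS_PER_PAGE value listed for Query.

```python
def iter_all_hits(fetch_page, page_size=500):
    """Yield every document by requesting successive pages.

    fetch_page(start, size) is assumed to return (docs, total_hits).
    """
    start = 0
    while True:
        docs, total = fetch_page(start, page_size)
        for doc in docs:
            yield doc
        start += len(docs)
        # Stop on an empty page or once every reported hit was yielded.
        if not docs or start >= total:
            return
```

Because it is a generator, only one page of results is held in memory at a time.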
-
get_all_paged
(query, per_page)¶ Get a generator to iterate over all pages of search results
Parameters: - query (boto.cloudsearch.search.Query) – A group of search criteria
- per_page (int) – Number of docs in each boto.cloudsearch.search.SearchResults object.
Return type: generator
Returns: Generator containing boto.cloudsearch.search.SearchResults
-
get_num_hits
(query)¶ Return the total number of hits for query
Parameters: query (boto.cloudsearch.search.Query) – A group of search criteria
Return type: int
Returns: Total number of hits for query
-
search
(q=None, bq=None, rank=None, return_fields=None, size=10, start=0, facet=None, facet_constraints=None, facet_sort=None, facet_top_n=None, t=None)¶ Send a query to CloudSearch
Each search query should use at least the q or bq argument to specify the search parameter. The other options are used to specify the criteria of the search.
Parameters: - q (string) – A string to search the default search fields for.
- bq (string) – A string to perform a Boolean search. This can be used to create advanced searches.
- rank (List of strings) – A list of fields or rank expressions used to order the
search results. A field can be reversed by using the - operator.
['-year', 'author']
- return_fields (List of strings) – A list of fields which should be returned by the
search. If this field is not specified, only IDs will be returned.
['headline']
- size (int) – Number of search results to specify
- start (int) – Offset of the first search result to return (can be used for paging)
- facet (list) – List of fields for which facets should be returned
['colour', 'size']
- facet_constraints (dict) – Limits facets to specific values,
given as comma-delimited strings in a dictionary keyed by facet
{'colour': "'blue','white','red'", 'size': "big"}
- facet_sort (dict) – Rules used to specify the order in which facet
values should be returned. Allowed values are alpha, count,
max, sum. Use alpha to sort alphabetically, and count to sort
the facet by number of available results.
{'colour': 'alpha', 'size': 'count'}
- facet_top_n (dict) – Dictionary mapping each facet to the number of
facet values to return.
{'colour': 2}
- t (dict) – Specify ranges for specific fields
{'year': '2000..2005'}
Return type: boto.cloudsearch.search.SearchResults
Returns: The results of this search
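The facet_constraints values above are comma-delimited strings of single-quoted terms. A minimal sketch of building such a string from a Python list follows; `facet_values` is a hypothetical helper, not part of boto, and it assumes the values contain no single quotes of their own.

```python
def facet_values(values):
    """Join values into a comma-delimited string of single-quoted terms."""
    # Assumption: values contain no embedded single quotes; no escaping is done.
    return ",".join("'%s'" % v for v in values)


facet_constraints = {
    "colour": facet_values(["blue", "white", "red"]),
}
```

The resulting dictionary could then be passed as the facet_constraints argument of search().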
The following examples all assume we have indexed a set of documents with fields: author, date, headline
A simple search looks for documents whose default text search fields contain the search word exactly:
>>> search(q='Tim') # Return documents with the word Tim in them (but not Timothy)
A simple search with more keywords will return documents whose default text search fields contain the search strings together or separately.
>>> search(q='Tim apple') # Will match "tim" and "apple"
More complex searches require the boolean search operator.
Wildcard searches can be used to search for any words that start with the search string.
>>> search(bq="'Tim*'") # Return documents with words like Tim or Timothy
Search terms can also be combined. Allowed operators are “and”, “or”, “not”, “field”, “optional”, “token”, “phrase”, or “filter”.
>>> search(bq="(and 'Tim' (field author 'John Smith'))")
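Boolean query strings like the one above are plain prefix-notation text, so they can be assembled with ordinary string formatting. The helpers below are a hypothetical sketch, not boto functions, and they assume the terms contain no single quotes that would need escaping.

```python
def term(value):
    """Quote a literal term, e.g. Tim -> 'Tim'."""
    return "'%s'" % value


def field(name, value):
    """Restrict a term to a named field."""
    return "(field %s %s)" % (name, term(value))


def combine(operator, *clauses):
    """Join clauses under a prefix operator such as 'and', 'or' or 'not'."""
    return "(%s %s)" % (operator, " ".join(clauses))


bq = combine("and", term("Tim"), field("author", "John Smith"))
wildcard = term("Tim*")
```

Here `bq` holds the same string as the combined example, and `wildcard` the same string as the wildcard example, so either could be passed as the bq argument.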
Facets allow you to show classification information about the search results. For example, you can retrieve the authors who have written about Tim:
>>> search(q='Tim', facet=['author'])
With facet_constraints, facet_top_n and facet_sort, more complicated constraints can be specified, such as returning the top author out of John Smith and Mark Smith who have a document with the word Tim in it.
>>> search(q='Tim',
...     facet=['author'],
...     facet_constraints={'author': "'John Smith','Mark Smith'"},
...     facet_top_n={'author': 1},
...     facet_sort={'author': 'count'})
-
-
class
boto.cloudsearch.search.
SearchResults
(**attrs)¶ -
next_page
()¶ Call Cloudsearch to get the next page of search results
Return type: boto.cloudsearch.search.SearchResults
Returns: the following page of search results
-
-
exception
boto.cloudsearch.search.
SearchServiceException
¶
boto.cloudsearch.document¶
-
exception
boto.cloudsearch.document.
CommitMismatchError
¶
-
class
boto.cloudsearch.document.
CommitResponse
(response, doc_service, sdf)¶ Wrapper for response to Cloudsearch document batch commit.
Parameters: - response (requests.models.Response) – Response from Cloudsearch /documents/batch API
- doc_service (boto.cloudsearch.document.DocumentServiceConnection) – Object containing the documents posted and methods to retry
Raises:
-
exception
boto.cloudsearch.document.
ContentTooLongError
¶ Content sent for CloudSearch indexing was too long.
This will usually happen when documents queued for indexing add up to more than the limit allowed per upload batch (5MB).
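One way to stay under the per-batch limit is to measure the encoded size of each command before grouping them. The sketch below is stand-alone and hypothetical: `split_batches` is not a boto function, and reading 5MB as 5 * 1024 * 1024 bytes is an assumption.

```python
import json

MAX_BATCH_BYTES = 5 * 1024 * 1024  # assumed reading of the 5MB limit


def split_batches(commands, max_bytes=MAX_BATCH_BYTES):
    """Group commands so each JSON-encoded batch stays within max_bytes.

    A single command larger than max_bytes still gets a batch of its own.
    """
    batches, current, current_size = [], [], 2  # 2 bytes for "[" and "]"
    for command in commands:
        size = len(json.dumps(command).encode("utf-8"))
        extra = size + (2 if current else 0)  # ", " separator between items
        if current and current_size + extra > max_bytes:
            batches.append(current)
            current, current_size, extra = [], 2, size
        current.append(command)
        current_size += extra
    if current:
        batches.append(current)
    return batches


commands = [{"id": "a" * 10}, {"id": "b" * 10}, {"id": "c" * 10}]
batches = split_batches(commands, max_bytes=45)
```

Each resulting batch could then be committed separately, clearing the working set between commits.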
-
class
boto.cloudsearch.document.
DocumentServiceConnection
(domain=None, endpoint=None)¶ A CloudSearch document service.
The DocumentServiceConnection is used to add, remove and update documents in CloudSearch. Commands are uploaded to CloudSearch in SDF (Search Data Format).
To generate an appropriate SDF, use add() to add or update documents and delete() to remove documents.
Once the set of documents is ready to be indexed, use commit() to send the commands to CloudSearch.
If there are a lot of documents to index, it may be preferable to split the generation of SDF data from the actual uploading into CloudSearch. Retrieve the current SDF with get_sdf(). If this file is then uploaded into S3, it can be retrieved afterwards for upload into CloudSearch using add_sdf_from_s3().
The SDF is not cleared after a commit(). If you wish to continue using the DocumentServiceConnection for another batch upload of commands, you will need to call clear_sdf() first to stop the previous batch of commands from being uploaded again.
-
add
(_id, version, fields, lang='en')¶ Add a document to be processed by the DocumentService
The document will not actually be added until commit() is called.
Parameters: - _id (string) – A unique ID used to refer to this document.
- version (int) – Version of the document being indexed. If a document is being reindexed, the version should be higher than the existing one in CloudSearch.
- fields (dict) – A dictionary of key-value pairs to be uploaded.
- lang (string) – The language code the data is in. Only ‘en’ is currently supported.
-
add_sdf_from_s3
(key_obj)¶ Load an SDF from S3
Using this method will result in documents added through add() and delete() being ignored.
Parameters: key_obj (boto.s3.key.Key) – An S3 key which contains an SDF
-
clear_sdf
()¶ Clear the working documents from this DocumentServiceConnection
This should be used after commit() if the connection will be reused for another set of documents.
-
commit
()¶ Actually send an SDF to CloudSearch for processing
If an SDF file has been explicitly loaded it will be used. Otherwise, documents added through add() and delete() will be used.
Return type: CommitResponse
Returns: A summary of documents added and deleted
-
delete
(_id, version)¶ Schedule a document to be removed from the CloudSearch service
The document will not actually be scheduled for removal until commit() is called.
Parameters: - _id (string) – The unique ID of this document.
- version (int) – Version of the document to remove. The delete will only occur if this version number is higher than the version currently in the index.
-
get_sdf
()¶ Generate the working set of documents in Search Data Format (SDF)
Return type: string
Returns: JSON-formatted string of the documents in SDF
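To make the add, delete, get_sdf and clear_sdf cycle concrete, here is a minimal in-memory sketch. `SketchDocumentBatch` is a hypothetical stand-in that never contacts CloudSearch, and the JSON keys it emits (`type`, `id`, `version`, `lang`, `fields`) are an assumption about the SDF layout rather than something this page specifies.

```python
import json


class SketchDocumentBatch(object):
    """Hypothetical in-memory stand-in; it never contacts CloudSearch."""

    def __init__(self):
        self.commands = []

    def add(self, _id, version, fields, lang="en"):
        # Queue an add/update command (assumed SDF key names).
        self.commands.append({"type": "add", "id": _id, "version": version,
                              "lang": lang, "fields": fields})

    def delete(self, _id, version):
        # Queue a delete command for the given document ID.
        self.commands.append({"type": "delete", "id": _id, "version": version})

    def get_sdf(self):
        # Serialise the working set as a JSON-formatted string.
        return json.dumps(self.commands)

    def clear_sdf(self):
        # Drop the working set so it is not uploaded again.
        self.commands = []


batch = SketchDocumentBatch()
batch.add("doc-1", 2, {"headline": "Tim visits", "author": "John Smith"})
batch.delete("doc-0", 2)
sdf = json.loads(batch.get_sdf())   # the payload a commit would upload
batch.clear_sdf()                   # needed before queuing the next batch
```

With the real DocumentServiceConnection the same sequence would apply, with commit() called before clear_sdf().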
-
-
exception
boto.cloudsearch.document.
EncodingError
¶ Content sent for CloudSearch indexing was incorrectly encoded.
This usually happens when a document is marked as unicode but non-unicode characters are present.
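A defensive option is to decode any byte strings strictly before queuing a document, so that badly encoded input fails locally instead of at commit time. The helpers below are a hypothetical sketch written for Python 3; they are not part of boto, and UTF-8 as the expected encoding is an assumption.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding bytes strictly so bad input fails early."""
    if isinstance(value, bytes):
        return value.decode(encoding)  # raises UnicodeDecodeError on bad bytes
    return value


def clean_fields(fields):
    """Decode every byte-string field value before it is queued."""
    return dict((name, to_text(value)) for name, value in fields.items())


cleaned = clean_fields({"headline": b"Caf\xc3\xa9", "author": "John Smith"})
```

The cleaned dictionary could then be passed as the fields argument of add().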
-
exception
boto.cloudsearch.document.
SearchServiceException
¶