Solr and CakePHP Integration - Part II
Dan Voyce
Posted on August 1, 2019
This is the second in a series of posts about setting up CakePHP with Solr. If you missed part 1, you can catch it here: https://dev.to/locally/cakephp-and-solr-integration-417e
In this post, I will share my experience with the Solarium PHP Solr client library and show how to perform some basic actions in Solr, such as adding, removing and searching documents. These actions were essential for building the CakePHP Solr Behavior, which handles full-text search in Solr and keeps the documents up to date.
We use Solr at LOCALLY to provide efficient Geofence and Geospatial processing for our Consumer Engagement platform, LOCALLY Engage.
Establishing a connection with Solr
With the Solarium PHP Solr client library in CakePHP, this task is easy: you need only a few lines to make a connection with Solr.
First, install the library:
composer require solarium/solarium
Then create the connection:
use Solarium\Client;
// the property and the method below belong inside your own class
protected $client;
public function init()
{
    $config = [
        'endpoint' => [
            'localhost' => [
                'host' => "SOLR-HOST",
                'port' => "SOLR-PORT",
                'path' => "/solr/SOLR-CORE/",
            ],
        ],
    ];
    $this->client = new Client($config);
}
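The placeholders in the config above have to come from somewhere. As a minimal sketch, the helper below builds the same array shape from a list of settings, for example values read from environment variables. The function name `buildSolrConfig`, the variable names `SOLR_HOST`, `SOLR_PORT` and `SOLR_CORE`, and the core name `mycore` are assumptions made for illustration; they are not part of Solarium or of the original setup.

```php
<?php
// Hypothetical helper (not part of Solarium): builds the endpoint config
// array in the same shape as the one passed to the client above.
function buildSolrConfig(array $settings): array
{
    // Assumed defaults for a local Solr install; 8983 is Solr's default port.
    $defaults = [
        'host' => '127.0.0.1',
        'port' => 8983,
        'core' => 'mycore',
    ];
    // Ignore empty values so that unset settings fall back to the defaults.
    $given = array_filter($settings, function ($value) {
        return $value !== null && $value !== '';
    });
    $settings = array_merge($defaults, $given);

    return [
        'endpoint' => [
            'localhost' => [
                'host' => $settings['host'],
                'port' => (int) $settings['port'],
                'path' => '/solr/' . trim($settings['core'], '/') . '/',
            ],
        ],
    ];
}

// The environment variable names are assumptions for this sketch.
$config = buildSolrConfig([
    'host' => getenv('SOLR_HOST') ?: null,
    'port' => getenv('SOLR_PORT') ?: null,
    'core' => getenv('SOLR_CORE') ?: null,
]);
```

The resulting `$config` can then be passed to the client in place of the hard-coded array.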
Performing a search
To perform a search in Solr, you need only a few calls. There are many options you can set on a query; the example below uses some of them through the library:
// create a client instance
$client = new Solarium\Client($config);
// get a select query instance
$query = $client->createQuery($client::QUERY_SELECT);
// set start and rows param (comparable to SQL limit) using fluent interface
$query->setStart(2)->setRows(20);
// pass a search term, or "*:*" to match every document in Solr
$query->setQuery($searchTerm);
// set fields to fetch (this overrides the default setting 'all fields')
$query->setFields(['id','name']);
// this executes the query and returns the result
$resultset = $client->execute($query);
If we create a function to perform a search and pass an array of options as a parameter, it should look like this:
public function search(array $options)
{
    // fall back to the class defaults when no paging options are given
    $query = $this->client->createSelect([
        'start' => $options['start'] ?? $this->start,
        'rows' => $options['limit'] ?? $this->rows,
    ]);
    // enable the eDisMax query parser for this query
    $query->getEDisMax();
    // search for the given term, or match everything when it is empty
    $query->setQuery(!empty($options['match_field']) ? $options['match_field'] : '*:*');
    $query->setFields('*');
    $resultset = $this->client->select($query);
    return $resultset->getData()['response'];
}
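Callers usually think in pages rather than offsets. As a small sketch, the helper below converts a 1-based page number into the `start` and `limit` options that the search function above reads. The name `pageToOptions` and the default page size of 20 are assumptions for illustration.

```php
<?php
// Hypothetical helper: converts a 1-based page number into the 'start' and
// 'limit' options read by a search(array $options) function.
function pageToOptions(int $page, int $perPage = 20): array
{
    // Guard against zero or negative input.
    $page = max(1, $page);
    $perPage = max(1, $perPage);

    return [
        'start' => ($page - 1) * $perPage, // offset of the first document
        'limit' => $perPage,               // number of documents to fetch
    ];
}

$options = pageToOptions(3, 10);
// $options is ['start' => 20, 'limit' => 10]
```

The array can be merged with the search term before calling the search function.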
Extended DisMax (eDismax) Query Parser
This is an interesting option. eDisMax is an improved version of the DisMax query parser: it supports query operators such as AND, OR, NOT, - and +, and it respects the "magic field" names such as _val_ and _query_. The Solr Reference Guide describes it in more detail.
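To make the operators concrete, here is a minimal sketch of a helper that builds a query string using `+` (term must match) and `-` (term must not match). The function name `buildBooleanQuery` is hypothetical, and the sketch assumes the terms do not contain double quotes or other special characters.

```php
<?php
// Hypothetical helper: builds a query string with the + (required) and
// - (excluded) operators that eDisMax understands.
function buildBooleanQuery(array $required, array $excluded = []): string
{
    // Wrap multi-word terms in double quotes so they are treated as phrases.
    // Assumption: terms do not themselves contain double quotes.
    $quote = function (string $term): string {
        return preg_match('/\s/', $term) ? '"' . $term . '"' : $term;
    };

    $parts = [];
    foreach ($required as $term) {
        $parts[] = '+' . $quote($term);
    }
    foreach ($excluded as $term) {
        $parts[] = '-' . $quote($term);
    }

    // Fall back to "match everything" when no terms were given.
    return $parts ? implode(' ', $parts) : '*:*';
}

echo buildBooleanQuery(['solr', 'full text'], ['drupal']), PHP_EOL;
// prints: +solr +"full text" -drupal
```

The resulting string could be passed as the `match_field` option of the search function shown earlier.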
Handling Solr Response
To handle the response of a search in Solr, you can:
Iterate over the Solr result object:
// this executes the query and returns the result
$resultset = $client->execute($query);
// display the total number of documents found by solr
echo 'NumFound: '.$resultset->getNumFound();
// show documents using the resultset iterator
foreach ($resultset as $document) {
    echo '<hr/><table>';
    // the documents are also iterable, to get all fields
    foreach ($document as $field => $value) {
        // this converts multivalue fields to a comma-separated string
        if (is_array($value)) {
            $value = implode(', ', $value);
        }
        echo '<tr><th>' . $field . '</th><td>' . $value . '</td></tr>';
    }
    echo '</table>';
}
Return only the response:
...
$query = $this->client->createSelect();
...
$resultset = $this->client->select($query);
return $resultset->getData()['response'];
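The `response` section of a Solr select result contains `numFound`, `start` and `docs`. As a sketch of what can be done with the returned array, the helper below turns it into a small paging summary. The name `summarizeResponse` and the shape of the returned array are assumptions for illustration.

```php
<?php
// Hypothetical helper: turns the 'response' array returned by Solr into a
// paging summary. $rows is the page size that was used for the query.
function summarizeResponse(array $response, int $rows): array
{
    $numFound = (int) ($response['numFound'] ?? 0);
    $start = (int) ($response['start'] ?? 0);
    $rows = max(1, $rows); // avoid division by zero

    return [
        'total' => $numFound,                      // documents matching the query
        'page' => intdiv($start, $rows) + 1,       // 1-based current page
        'pages' => (int) ceil($numFound / $rows),  // total number of pages
        'docs' => $response['docs'] ?? [],         // documents on this page
    ];
}

$summary = summarizeResponse(
    ['numFound' => 45, 'start' => 20, 'docs' => [['id' => '1']]],
    20
);
// $summary['page'] is 2 and $summary['pages'] is 3
```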
Debugging Solr Response
// create a client instance
$client = new Solarium\Client($config);
// get a select query instance
$query = $client->createSelect();
// match every document
$query->setQuery('*:*');
// add debug settings
$debug = $query->getDebug();
$debug->setExplainOther('id:MA*');
// this executes the query and returns the result
$resultset = $client->select($query);
$debugResult = $resultset->getDebug();
// display the debug results
echo '<h1>Debug data</h1>';
echo 'Querystring: ' . $debugResult->getQueryString() . '<br/>';
echo 'Parsed query: ' . $debugResult->getParsedQuery() . '<br/>';
echo 'Query parser: ' . $debugResult->getQueryParser() . '<br/>';
echo 'Other query: ' . $debugResult->getOtherQuery() . '<br/>';
echo '<h2>Explain data</h2>';
foreach ($debugResult->getExplain() as $key => $explanation) {
    echo '<h3>Document key: ' . $key . '</h3>';
    echo 'Value: ' . $explanation->getValue() . '<br/>';
    echo 'Match: ' . (($explanation->getMatch() == true) ? 'true' : 'false') . '<br/>';
    echo 'Description: ' . $explanation->getDescription() . '<br/>';
    echo '<h4>Details</h4>';
    foreach ($explanation as $detail) {
        echo 'Value: ' . $detail->getValue() . '<br/>';
        echo 'Match: ' . (($detail->getMatch() == true) ? 'true' : 'false') . '<br/>';
        echo 'Description: ' . $detail->getDescription() . '<br/>';
        echo '<hr/>';
    }
}
echo '<h2>ExplainOther data</h2>';
foreach ($debugResult->getExplainOther() as $key => $explanation) {
    echo '<h3>Document key: ' . $key . '</h3>';
    echo 'Value: ' . $explanation->getValue() . '<br/>';
    echo 'Match: ' . (($explanation->getMatch() == true) ? 'true' : 'false') . '<br/>';
    echo 'Description: ' . $explanation->getDescription() . '<br/>';
    echo '<h4>Details</h4>';
    foreach ($explanation as $detail) {
        echo 'Value: ' . $detail->getValue() . '<br/>';
        echo 'Match: ' . (($detail->getMatch() == true) ? 'true' : 'false') . '<br/>';
        echo 'Description: ' . $detail->getDescription() . '<br/>';
        echo '<hr/>';
    }
}
echo '<h2>Timings (in ms)</h2>';
echo 'Total time: ' . $debugResult->getTiming()->getTime() . '<br/>';
echo '<h3>Phases</h3>';
foreach ($debugResult->getTiming()->getPhases() as $phaseName => $phaseData) {
    echo '<h4>' . $phaseName . '</h4>';
    foreach ($phaseData as $class => $time) {
        echo $class . ': ' . $time . '<br/>';
    }
    echo '<hr/>';
}
Adding Documents to Solr
To keep the index up to date, you can add or update documents in Solr whenever a record is created or updated in your database, for example. Let’s create a function for it.
public function saveSearchIndex($data)
{
    // get an update query instance
    $update = $this->client->createUpdate();
    // create a new document for the data,
    // you can do validations before this line
    $document = $update->createDocument();
    // $data is an array whose keys are the Solr field names
    foreach ($data as $key => $doc) {
        $document->{$key} = $doc;
    }
    // add the document and a commit command to the update query
    $update->addDocument($document);
    $update->addCommit();
    // this executes the query and returns the result
    return $this->client->update($update);
}
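The comment in the function above mentions validating the data before building the document. As one possible sketch of such a step, the helper below drops `null` values and converts date objects to the UTC format that Solr date fields expect (for example `2019-08-01T10:00:00Z`). The name `prepareSolrDocument` and the decision to skip nulls are assumptions for illustration, not part of the original Behavior.

```php
<?php
// Hypothetical helper: cleans a record before it is copied into a Solr document.
function prepareSolrDocument(array $data): array
{
    $prepared = [];
    foreach ($data as $key => $value) {
        if ($value === null) {
            continue; // skip missing values instead of sending empty fields
        }
        if ($value instanceof DateTimeInterface) {
            // Creating the date from a Unix timestamp yields a UTC date,
            // which is then formatted as YYYY-MM-DDThh:mm:ssZ.
            $utc = new DateTimeImmutable('@' . $value->getTimestamp());
            $value = $utc->format('Y-m-d\TH:i:s\Z');
        }
        $prepared[$key] = $value;
    }
    return $prepared;
}

$data = prepareSolrDocument([
    'id' => 7,
    'name' => null,
    'created' => new DateTime('2019-08-01 12:00:00', new DateTimeZone('+02:00')),
]);
// $data is ['id' => 7, 'created' => '2019-08-01T10:00:00Z']
```

The cleaned array can then be passed to the function that saves the document.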
Deleting Documents
In a very similar way, you can delete documents in Solr. There are two options: delete by id or delete by query. If you are using a database view over multiple tables as the Solr source, as I did, delete by query is the better option, because the same id can appear in different tables.
In the function below, both conditions are combined with AND in a single delete query, so only documents that have the id $id and the result_type $result_type are deleted. Adding the two conditions as separate delete queries would instead delete every document that matches either one:
public function deleteSearchIndex($id, $result_type)
{
    $update = $this->client->createUpdate();
    // one query with both conditions, so both must match the same document
    $update->addDeleteQuery('id:' . $id . ' AND result_type:' . $result_type);
    $update->addCommit();
    return $this->client->update($update);
}
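When the values come from user input or contain spaces, concatenating them directly into a delete query can produce a broken or overly broad query. One way to make sure all conditions apply to the same document is to build a single query string joined with AND and to quote each value. The sketch below does that; the name `buildDeleteQuery` is hypothetical, and quoting every value as a phrase is an assumption that suits exact-match fields such as ids and types.

```php
<?php
// Hypothetical helper: builds one delete query in which every condition
// must match, for example: id:"42" AND result_type:"store"
function buildDeleteQuery(array $conditions): string
{
    if ($conditions === []) {
        // An empty query must never be sent as a delete.
        throw new InvalidArgumentException('Refusing to build an empty delete query.');
    }

    $clauses = [];
    foreach ($conditions as $field => $value) {
        // Escape embedded double quotes and backslashes, then quote the value.
        $escaped = preg_replace('/(["\\\\])/', '\\\\$1', (string) $value);
        $clauses[] = $field . ':"' . $escaped . '"';
    }

    return implode(' AND ', $clauses);
}

echo buildDeleteQuery(['id' => 42, 'result_type' => 'store']), PHP_EOL;
// prints: id:"42" AND result_type:"store"
```

The returned string can be passed to a single `addDeleteQuery()` call.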
The same function, deleting by id:
public function deleteSearchIndex($id)
{
    $update = $this->client->createUpdate();
    $update->addDeleteById($id);
    $update->addCommit();
    return $this->client->update($update);
}
That’s it! It is pretty simple to work with Solr once you’ve worked out its fundamentals.
In this post we’ve seen how to:
- Create a connection with Solr using the Solarium PHP Solr client library.
- Perform a search, setting the query, limits and fields and using the eDisMax query parser.
- Debug a query response.
- Handle a response.
- Add and delete documents in Solr.
In the next blog, I will write about the CakePHP behavior to work with Solr and how to display results to the user.
Lorena Santana - Platform Developer. System analyst and web developer for the last 8+ years, working with different technologies and methodologies. Experience with ERPs, CMSs and frameworks such as CakePHP, Bootstrap and WordPress. Self-taught full-stack developer with expertise in database modeling and maintenance and in requirements gathering. Experience with agile methodologies; works well under pressure. Experience with interns.