I made a schema that included a field called "id" and a multi-valued field called "lookupids". I created a file with a header row of "id<tab>lookupids" and data rows consisting of a GUID followed by random integers separated by commas, for example:
269d8a33-0fd6-4877-b631-dccc4146cf90<tab>11507,25964,118430,306825,315793,348797,349191
The file contained 100,000 entries, and I was able to index it using a URL like this:
http://localhost:8983/solr/update/csv?commit=true&separator=%09&stream.file=exampledocs/test_with_lookupids.txt
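For anyone who wants to reproduce the test file described above, here is a minimal sketch of a generator. It is not the script I used. The function names, the cap of 50 ids per row, and the upper bound of 1,000,000 on the id values are assumptions chosen to resemble the sample rows.

```python
import random
import uuid


def make_row(rng, max_ids=50, max_value=1_000_000):
    """Return one data row: a GUID, a tab, then sorted comma-separated random ints."""
    # Build a version-4 style GUID from the seeded generator so runs are repeatable.
    guid = uuid.UUID(int=rng.getrandbits(128), version=4)
    count = rng.randint(1, max_ids)
    # sample() picks distinct values, so no id repeats within a row.
    ids = sorted(rng.sample(range(1, max_value), count))
    return f"{guid}\t{','.join(map(str, ids))}"


def write_test_file(path, rows=100_000, seed=0):
    """Write the "id<tab>lookupids" header plus `rows` data rows to `path`."""
    rng = random.Random(seed)
    with open(path, "w", encoding="utf-8") as f:
        f.write("id\tlookupids\n")
        for _ in range(rows):
            f.write(make_row(rng) + "\n")
```

Calling `write_test_file("exampledocs/test_with_lookupids.txt")` would produce a file in the same shape as the one indexed above.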
I expected the results to return the lookupids as an array. Instead, the lookupids field values are returned as a single string, exactly as they appeared in the source file.
<result name="response" numFound="1" start="0">
<doc>
<str name="id">e09d8f38-c1ef-4a97-a832-a4bdc0b18bc5</str>
<str name="lookupids">2,16481,38485,50205,101885,107642,110903,142770,174184,193689,204770,223341,225669,242335,253654,278519,284132,333735,352163,372383,377816,401338,420851,443967,500899,575204,593052,645555,667294,742558,757738,804361,826200,828540,839016,859782,875115,877853,893658,915890,945398,954502,969859,971992,989172</str>
<long name="_version_">1419020904549056527</long>
</doc>
</result>
The reason I was expecting the lookupids to be returned as an array is that the lookupids field was defined as follows:
<field name="lookupids" type="commaDelimited" indexed="true" stored="true" multivalued="true"/>
I figured that defining the field as multivalued, with the commaDelimited type using the PatternTokenizer and a pattern that splits tokens on the comma, would produce the array response.
I'll update this post once I figure out how to get the results as an array.
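One possible direction, not verified against this setup: a field type's tokenizer only affects the terms that get indexed, while the stored value is returned verbatim, so the PatternTokenizer alone would not change the response. The CSV update handler documents per-field `f.<fieldname>.split` and `f.<fieldname>.separator` parameters that split a column into multiple values at load time. The sketch below only builds such a URL. The helper name `build_csv_update_url` is hypothetical, and the field would still need to be genuinely multi-valued in the schema (Solr's documentation spells the attribute `multiValued`).

```python
from urllib.parse import urlencode


def build_csv_update_url(base, stream_file, split_field, split_separator=","):
    """Build a CSV update URL that asks Solr to split one column into multiple values.

    `base` is the update handler URL, e.g. http://localhost:8983/solr/update/csv.
    """
    params = [
        ("commit", "true"),
        ("separator", "\t"),  # the file's column separator, a tab
        ("stream.file", stream_file),
        (f"f.{split_field}.split", "true"),  # split this column at load time
        (f"f.{split_field}.separator", split_separator),  # separator inside the column
    ]
    # Keep "/" unescaped so the file path stays readable in the URL.
    return f"{base}?{urlencode(params, safe='/')}"
```

For example, `build_csv_update_url("http://localhost:8983/solr/update/csv", "exampledocs/test_with_lookupids.txt", "lookupids")` gives the same URL as before with the two split parameters appended.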