Hi Diane,
The 'schema version' of your data is set when you first call
createSchema, based on the version of GeoMesa you are using. When we
modify our internal schema (for example, when we replaced the
geohash index with a z-index), we bump the internal schema version.
However, to maintain backward compatibility, we keep the old code
(e.g. the geohash index) in our codebase, and it is still used for
older data sets.
We haven't been explicit about when the schema version changes, but
in general, if we mention new indices or index improvements in our
release notes, we've probably bumped the schema version. You can see
the version of your data by looking in your simple feature type's
user data under the key 'geomesa.version', or by examining the
catalog table in the Accumulo shell for a similar entry.
The current version is stored here, so you can always check it when
we have a new release:
https://github.com/locationtech/geomesa/blob/master/geomesa-utils/src/main/scala/org/locationtech/geomesa/geomesa.scala#L22
When we release a new schema version, it is up to you to decide
whether the benefits of using it outweigh the cost of migrating data
to the new format. In order to get the new index, we essentially have
to re-write all your data. We have a map/reduce job to do this, as
outlined in the docs, but for large data sets that can still be
onerous. Even if you don't migrate your data, upgrading will often
fix bugs and bring other query improvements that don't require a
change in how the data is stored.
Hope that clears things up. Let us know if you have any other
questions.
Thanks,
Emilio
On 07/02/2016 05:57 AM, Diane Griffith wrote:
Emilio,
So upgrading to 1.2.3 seems to have fixed the feature count
issue. When I tried applying the patch to 1.2.2 and, I believe,
setting the system property for GeoServer, it did not fix the
count issue. That is why I upgraded to GeoMesa 1.2.3 once
it was available. Just wanted to follow up on that.
Also, you say I do not need to run the stats job
manually. I was reading the install documentation, and it talks
about how you all improve indexes all the time. When we upgrade,
should we be doing something to leverage the new index
logic? We have data ingesting continually for existing
tables/feature types to bring in new data daily. It
is not clear whether we need to incorporate something around the
indexes into the upgrade process.
Thanks,
Diane
1.2.3 should be released
tonight, I believe that the release job is running right
now. If you'd like to apply the fix to 1.2.2, this is the
commit:
https://github.com/locationtech/geomesa/commit/d4955169583faabc70fd53c95b0e1ce7d6d84777
I believe it will apply without any conflicts on 1.2.2.
Either with the patch or the new version, you will need to
set the system property hint to get an exact count. In 1.2.3
we've optimized things a bit so that it should run faster
for exact counts, and also return a better estimate for
non-exact counts.
Thanks,
Emilio
On 06/23/2016 05:59 PM, Diane Griffith wrote:
So I am using WFS version 2.0.0 for the requests I
send to pull the feature data.
I get numberMatched when I add &resultType=hits
to the WFS url; it gives back xml rather than json, I think
regardless of what I set as the outputFormat. I only
use that when I just want to see the count and not the
actual features, for debugging and validation testing.
When I use &outputFormat=json without
&resultType=hits in the WFS url, I get a
totalFeatures field in the json response.
Regardless of the field name the value is consistent
for WFS version 2.0.0:
GeoMesa 1.2.2 returns total number of features in the
feature type specified so that means the total is not
filtered at all by the cql_filter field passed (that
contains a bbox and additional cql)
GeoMesa 1.2.0 does return the number of features that
match the filter I provide in the cql_filter field
(containing bbox and additional cql) for the feature
type specified.
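For anyone reproducing this, the two request styles described above can be sketched as follows. This is only an illustration: the base URL and type name are hypothetical placeholders, and the filter is the one quoted further down in this thread. It assumes a GeoServer-style WFS endpoint that accepts the cql_filter parameter.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and layer name, for illustration only.
BASE_URL = "http://localhost:8080/geoserver/wfs"
TYPE_NAME = "myworkspace:trips"

# Filter as quoted in this thread.
CQL = ("(BBOX(pickupLocation, -73.82473468780518,40.82369327545166,"
       "-73.81924152374268,40.828328132629395, 'EPSG:4326') "
       "AND (Total_amount >=1 AND Total_amount <=5))")


def wfs_url(cql_filter, hits_only):
    """Build a WFS 2.0.0 GetFeature URL, either count-only or JSON features."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": TYPE_NAME,
        "cql_filter": cql_filter,
    }
    if hits_only:
        # Count only: the response carries numberMatched and no features.
        params["resultType"] = "hits"
    else:
        # Full features: the JSON response carries a totalFeatures field.
        params["outputFormat"] = "json"
    return BASE_URL + "?" + urlencode(params)


hits_url = wfs_url(CQL, hits_only=True)
json_url = wfs_url(CQL, hits_only=False)
```

Issuing both URLs against the 1.2.0 and 1.2.2 systems gives a like-for-like comparison of the reported counts.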
What class sets the totalFeatures return value? I
haven't found that class yet, to see if there is
a patch I could try to apply myself to fix this
problem.
Or is there a patch I could apply for that
system property so that it does work?
We had hoped to let users beta test on our more
operational system, but that is set to GeoMesa 1.2.2, and
this bug is something of a blocker to letting users in to
test.
Is there a target timeframe for the 1.2.3 release?
Thanks,
Diane
Diane, the WFS
JSON testing I was doing didn't have a
'numberMatched' field - instead it has a
'totalFeatures' field. I assume they would be mapped
to the same thing, but I couldn't figure out how to
get that format back. I just clicked on the GML
layer preview, and then added outputFormat=json to
the URL.
Thanks,
Emilio
On 06/23/2016 03:32 PM, Emilio Lahr-Vivaz wrote:
Oops, I verified that the system
property is not getting handled correctly. I'll
open a bug for it, and we'll get it into 1.2.3.
Thanks,
Emilio
On 06/23/2016 03:14 PM, Emilio Lahr-Vivaz wrote:
What version of WFS are you using
for the request? Yes, the system property should
be set in the geoserver startup scripts, like:
-Dgeomesa.force.count=true
On 06/23/2016 02:44 PM, Diane Griffith wrote:
For the system
property route, do you mean set it on
geoserver? If so then that did not
work.
For the second
option I’m not directly querying the
accumulo stack, I’m doing it via
geoserver so I do not think I can add
that hint.
What I do know is that the
1.2.0 dev system returns the correct
number of matches in the WFS
response. The 1.2.2 based system
returns the total number of features
regardless of the actual query. I
haven't upgraded the 1.2.0 dev system, but
I bet that as soon as I do, it will no longer
return a correct count. To give concrete
numbers: the WFS request matched
3 records/features.
On the 1.2.0 based system, for the same data, I had
numberMatched="3", but the 1.2.2 based system had
numberMatched="1508503" (for resultType=hits).
Same data set, same query:
CQL_FILTER=(BBOX(pickupLocation, -73.82473468780518,40.82369327545166,-73.81924152374268,40.828328132629395, 'EPSG:4326') AND (Total_amount >=1 AND Total_amount <=5))
With outputFormat=json:
1.2.0 gave field "totalFeatures": 3
1.2.2 gave field "totalFeatures": 1508503
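To compare the two response shapes programmatically, something like the sketch below could be used. The sample documents are hand-written and trimmed, not captured server output; they only reuse the counts reported above. It assumes the hits response carries numberMatched as an attribute on the root element, which WFS 2.0 also allows to be the literal string "unknown".

```python
import json
import xml.etree.ElementTree as ET


def count_from_hits_xml(xml_text):
    """Read numberMatched from a resultType=hits response.

    Returns None when the server reports "unknown" instead of a number.
    """
    root = ET.fromstring(xml_text)
    value = root.attrib["numberMatched"]
    return None if value == "unknown" else int(value)


def count_from_json(json_text):
    """Read totalFeatures from an outputFormat=json response."""
    return int(json.loads(json_text)["totalFeatures"])


# Hand-written, trimmed samples using the counts from this thread.
hits_120 = ('<wfs:FeatureCollection '
            'xmlns:wfs="http://www.opengis.net/wfs/2.0" '
            'numberMatched="3" numberReturned="0"/>')
json_122 = ('{"type": "FeatureCollection", '
            '"totalFeatures": 1508503, "features": []}')

matched_120 = count_from_hits_xml(hits_120)
total_122 = count_from_json(json_122)
```

Either helper gives a single integer per system, which makes the discrepancy easy to assert on in a validation test.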
Diane
Hi Diane,
Most likely it is due to getCount being
called on the feature collection being
returned. Since GeoMesa is streaming data
from accumulo, in order to calculate an
exact count we essentially have to run the
query twice. Because of this, we return a
fake count, or in the upcoming 1.2.3 an
estimated count.
Since sometimes the exact count is
important, we have two ways to force the
counting:
1. Set the system property
"geomesa.force.count" to "true" to force
exact counting for all queries
2. For individual queries, set a query
hint using the key
org.locationtech.geomesa.accumulo.index.QueryHints.EXACT_COUNT
and the value of Boolean.TRUE
I believe this behavior was the same in
1.2.0 though...
Could you try one of the above and see if
it resolves your problem?
Thanks,
Emilio
On 06/23/2016 01:10 PM, Diane Griffith wrote:
We had upgraded to
GeoMesa 1.2.2 in order to fix the
ability to combine IN parameters in a
cql call and get proper results. But I
did leave one environment set to GeoMesa
1.2.0.
What we noticed
recently, when finally hooking up paging, is
that the WFS calls (version 2.0.0) we
issued would return the correct number
of results for the CQL_FILTER (which
holds a bbox as well as other cql
filters), but the value in numberMatched
was the total number of features for the
specified feature type, not the number of
results that match the
CQL_FILTER. So we would indicate that
more pages were available to the user
than there were actual results.
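The effect on paging can be illustrated with a small calculation. The page size of 10 is a hypothetical value chosen for the example; the two counts are the ones reported elsewhere in this thread.

```python
def page_count(number_matched, page_size):
    """Number of pages a client would offer for a given match count."""
    # Integer ceiling division, so a partial last page still counts.
    return (number_matched + page_size - 1) // page_size


PAGE_SIZE = 10  # hypothetical page size

pages_correct = page_count(3, PAGE_SIZE)         # count that honours the filter
pages_inflated = page_count(1508503, PAGE_SIZE)  # total features in the type
```

With these inputs the correct count yields 1 page, while the unfiltered total yields 150851 pages, nearly all of them empty.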
I also tried testing
where I just sent a BBOX parameter
instead of combining that with the
cql_filter and it was behaving the
same. WFS responses hooked to GeoMesa
1.2.0 indicated the correct number but
WFS responses hooked to GeoMesa 1.2.2
were reporting the total number of
features for the specified feature type
not the number of features that matched
the bbox specified for that feature
type.
I looked at the
queries logged in accumulo and they
looked the same on quick review. I
also looked at the call logged by
GeoServer and it looked the same as
well.
I was trying to find
where in the code it set the
numberMatched field for the WFS 2.0.0
response but I hadn’t found that yet.
I can upgrade the old
environment but I believe once I do that
it will behave the same and no longer
give correct numberMatched values.
I compared the JDK
versions of the servers, the GeoServer
versions, and whether they both have native JAI,
and it all matches. Both also appear to
use the same version of GeoTools, 14.1.
The only difference I have found so far
is the GeoMesa version. Has anyone noticed
this issue yet? Is there a fix for it?
Thanks,
Diane
_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://www.locationtech.org/mailman/listinfo/geomesa-users