Closed Bug 480340 Opened 15 years ago Closed 11 years ago

Places DB sample data set creation for testing

Categories

(Firefox :: Bookmarks & History, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: ddahl, Assigned: ddahl)

References

()

Details

(Keywords: dev-doc-needed)

Attachments

(4 files, 19 obsolete files)

10.11 KB, patch
65.51 KB, patch (anodelman: review+)
47.23 KB, application/octet-stream (anodelman: review+)
45.50 KB, patch (bhearsum: review+, ddahl: feedback+)
Create python scripts to generate Places DBs with various characteristics such as "many visits within the same domain", "visits across many domains", "many tags", "many bookmarks", etc. 

Create JS bookmarklet/console script to harvest statistics from Places db.
Assignee: nobody → ddahl
This is working on my machine; posting so adw can get it running as well. We may yet need to create a virtualenv for automated usage.
Attached file Places stats generator (obsolete) —
First draft JS to generate stats on the user's Places database.  Copy and paste into your JS console.  Outputs stats to the console in JSON for easy parsing later.
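For anyone who would rather inspect a copy of places.sqlite offline, here is a rough Python sketch of the same idea (count rows per table, print JSON). This is not the attached JS; the function names and the choice of tables are assumptions made for illustration, and the real script collects far more than row counts.

```python
import json
import sqlite3

# Assumption: a small subset of Places tables, picked only for illustration.
TABLES = ("moz_places", "moz_historyvisits", "moz_bookmarks")


def collect_stats(conn, tables=TABLES):
    """Return a {table_name: row_count} dict for the given Places tables."""
    stats = {}
    for table in tables:
        # Table names come from the fixed tuple above, never from user input.
        stats[table] = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    return stats


def stats_json(conn, tables=TABLES):
    """Serialize the stats as JSON so they are easy to parse later."""
    return json.dumps(collect_stats(conn, tables), sort_keys=True)
```

Usage would be something like `print(stats_json(sqlite3.connect("places.sqlite")))`, run against a copy of the profile's database rather than the live file.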
Attached file Places stats generator (obsolete) —
With source code comments about each piece collected and dates output as nice strings.
Attachment #364399 - Attachment is obsolete: true
Attached file Places stats generator (obsolete) —
Now computes livemark container and livemark child counts, per Dietrich's email.
Attachment #364419 - Attachment is obsolete: true
Nice. Some comments on the receiving page:
The legend is vertical while the results are horizontal. Each column could have a sort of tooltip with a description and the query we run.
Column headers should also stay visible when there are many entries.
Attached file Places stats generator (obsolete) —
temp tables handled correctly.
Attachment #364888 - Attachment is obsolete: true
Attached file Places stats generator (obsolete) —
Now with an alert on completion!
Attachment #364954 - Attachment is obsolete: true
Attached file Places stats generator (obsolete) —
Doesn't phone home anymore.  To be used with new site.
Attachment #364972 - Attachment is obsolete: true
Attached file Places stats generator (obsolete) —
Attachment #365299 - Attachment is obsolete: true
Attached file Places stats generator (obsolete) —
Attachment #365322 - Attachment is obsolete: true
We should wrap it up in a Ubiquity command.
WIP #2: added more information about the command in the "preview" and updated the URL to https://places-stats.mozilla.com
Attachment #365689 - Attachment is obsolete: true
I see that you're getting information about all the add-ons a user has installed, but you're not collecting which add-ons are actually enabled or not. I think you should probably get that data, too. Breakpad collects that, for instance.
(In reply to comment #13)
Yeah, two reasons for that:
1) I couldn't figure out how to do it. :\ If anyone knows, holla back.
2) People can disable an extension after using it for a long time.  I imagined we would come across a scenario like this:  We get some stats that bear the signature of a certain add-on.  The user(s) has that add-on installed but disabled.  So we totally ignore the disabled status.  Or, not even that.  We come across some stats with weird profiles.  Which extension is the cause?  Some are disabled, but maybe they were enabled last week.  So again we assume they were each enabled at one point and ignore disabled status.
(In reply to comment #14)
I'm in support of (2).  If it's installed, it likely had its impact on the db at some point.
Updated the python generator code - added glue to allow https fetch of JSON stats data live from the https://places-stats.mozilla.com/stats/ site
Attachment #364398 - Attachment is obsolete: true
According to sdwilsh, add moz_keywords and moz_inputhistory to the data generator.
Attachment #365326 - Attachment is obsolete: true
Added keywords support; inputhistory is next.
Attachment #366916 - Attachment is obsolete: true
I plan on setting up my mac mini to run the generator nightly, perhaps pushing the resulting places.sqlite to the intranet.
Attachment #367118 - Attachment is obsolete: true
Blocks: 489513
Attachment #367263 - Attachment is obsolete: true
Attached patch generator update (obsolete) — Splinter Review
Attachment #374173 - Attachment is obsolete: true
Attached patch generator update (obsolete) — Splinter Review
generate.sh
Attachment #374525 - Attachment is obsolete: true
Assignee: ddahl → nobody
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
Hardware: x86 → All
Need to add a date updater script to run daily to keep the data in the db "fresh".
IMHO advising users to paste code into the error console's command line isn't the best of ideas; more in bug 491243 comment 11.
Depends on: 498820
Cleaned up date handling and added the ability to update all dates in the generated db via builddb/increment_dates.py, which relies on the existing env vars of the generate script. Not Python 2.6 compatible due to a timeout bug in httplib2.
Attachment #374526 - Attachment is obsolete: true
Forgot to update the places schema to the current 3.5 version
Attachment #385436 - Attachment is obsolete: true
The generated Places dbs should be named places_generated_max.sqlite, places_generated_avg.sqlite, and places_generated_min.sqlite.

The command line args look like this:

python builddb/generator.py -i avg

python builddb/generator.py -i min

python builddb/generator.py -i max
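
A minimal sketch of how the `-i` argument could map to those output names, using modern Python's argparse. This is not the attached generator.py; `OUTPUT_NAMES`, `parse_args`, and `output_name` are hypothetical names, and only the `-i` flag and the three file names come from this bug.

```python
import argparse

# Assumption: each -i value selects one of the three file names listed above.
OUTPUT_NAMES = {
    "min": "places_generated_min.sqlite",
    "avg": "places_generated_avg.sqlite",
    "max": "places_generated_max.sqlite",
}


def parse_args(argv=None):
    """Parse the generator command line; -i picks the min, avg, or max profile."""
    parser = argparse.ArgumentParser(description="Generate a sample Places db")
    parser.add_argument(
        "-i",
        dest="profile",
        choices=sorted(OUTPUT_NAMES),
        required=True,
        help="which data set to generate: min, avg, or max",
    )
    return parser.parse_args(argv)


def output_name(profile):
    """Return the database file name for the chosen profile."""
    return OUTPUT_NAMES[profile]
```

For example, `output_name(parse_args(["-i", "avg"]).profile)` gives `places_generated_avg.sqlite`, and any value other than min, avg, or max is rejected by argparse.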
Average generation failed due to floating-point-to-int type coercion.
Attachment #385483 - Attachment is obsolete: true
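To illustrate the kind of failure described above: an average of integer counts is a float, and a float cannot be used directly as a row count or passed to range(). A small sketch of one way to guard against that follows; `average_count` is a hypothetical helper and not the fix that actually landed in the patch.

```python
def average_count(samples):
    """Return the mean of integer samples as an int usable as a row count."""
    if not samples:
        return 0
    # sum()/len() yields a float; round it and convert explicitly so that
    # callers can pass the result to range() or use it as a LIMIT value.
    return int(round(sum(samples) / len(samples)))


# Example: build that many placeholder rows.
rows = [("http://example.com/%d" % i,) for i in range(average_count([3, 4, 5]))]
```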
Now we know when the generator or date increment script was last run, so date_increment.py will calculate how many days to roll the dates forward.
Attachment #385506 - Attachment is obsolete: true
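A rough sketch of the date-rolling idea, assuming the last-run date is already known to the caller (how the script actually stores it is not shown in this bug). Places keeps visit_date in moz_historyvisits as microseconds since the epoch, so the shift is a single UPDATE. `days_since` and `roll_visit_dates` are hypothetical names, and only one of the date columns is handled here.

```python
import sqlite3
from datetime import date

USEC_PER_DAY = 24 * 60 * 60 * 1_000_000  # Places stores dates in microseconds


def days_since(last_run, today):
    """Whole days between the last run and today, never negative."""
    return max((today - last_run).days, 0)


def roll_visit_dates(conn, days):
    """Shift every visit forward by the given number of days."""
    offset = days * USEC_PER_DAY
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE moz_historyvisits SET visit_date = visit_date + ?",
            (offset,),
        )
```

A daily job would call `roll_visit_dates(conn, days_since(last_run, date.today()))` and then record today's date as the new last-run value.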
Attachment #387327 - Flags: review?(anodelman)
Zipped it for CVS users :)
Attachment #387328 - Flags: review?(anodelman)
I wonder if writing the queries to a file, and reading that file as a single transaction would really speed up the initial creation step? Would the overhead of writing to the file then reading back the file and executing all of the inserts be slower than the current implementation via django's ORM?
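One way to try the idea above would be something like the following sketch: write the INSERT statements to a file wrapped in BEGIN/COMMIT, then run the whole file through sqlite3 in one go. The helper names are hypothetical and the moz_places insert is simplified to the url column only; whether this beats the ORM path would still need to be timed.

```python
import sqlite3


def write_inserts(path, urls):
    """Write one INSERT per URL to a SQL file, wrapped in a single transaction."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("BEGIN;\n")
        for url in urls:
            escaped = url.replace("'", "''")  # escape quotes for a SQL literal
            fh.write(f"INSERT INTO moz_places (url) VALUES ('{escaped}');\n")
        fh.write("COMMIT;\n")


def load_script(conn, path):
    """Read the SQL file back and execute it as one script."""
    with open(path, encoding="utf-8") as fh:
        conn.executescript(fh.read())
```

Because the file carries its own BEGIN and COMMIT, SQLite commits once at the end instead of once per insert, which is usually where the time goes.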
Adding dev-doc-needed. Please create a wiki page explaining how to use it and which prerequisites it needs. I spent an hour discovering all the prerequisites on Ubuntu (the django, httplib2, and simplejson Python packages are needed, plus a bunch of env vars). It is finally working, but it was not as straightforward as I was expecting :) Also, a script to set up the env variables might help if this is going to be a general-purpose db generator.
Keywords: dev-doc-needed
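Until that wiki page exists, a quick check like the following sketch could at least report which of the Python packages named above are missing before the generator is run. `missing_modules` is a hypothetical helper; the required env vars are not listed in this bug, so they are deliberately left out.

```python
import importlib.util

# Package names taken from the comment above (import names assumed to match).
REQUIRED_MODULES = ("django", "httplib2", "simplejson")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running `print(missing_modules())` before the first generate step gives an empty list when everything is installed.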
can this be closed?
This still needs to be landed somewhere, as far as I know?
Assignee: nobody → anodelman
Attachment #387327 - Flags: review?(anodelman) → review+
Attachment #387328 - Flags: review?(anodelman) → review+
Assignee: anodelman → nobody
Assignee: nobody → ddahl
Alice said over IRC that it could be checked into the talos repo; if it needs any tweaks I can look it over first.
Attachment #453234 - Flags: feedback?(ddahl)
Comment on attachment 453234 [details] [diff] [review]
[checked in]add generator code to buildfarm/utils

This looks like it's already been used a few times. Don't think it makes sense for me to review it. Welcome to stampy town.
Attachment #453234 - Flags: review?(bhearsum) → review+
Comment on attachment 453234 [details] [diff] [review]
[checked in]add generator code to buildfarm/utils

Looks good to me. I assume this is exactly the code you are currently using in production?
Attachment #453234 - Flags: feedback?(ddahl) → feedback+
Comment on attachment 453234 [details] [diff] [review]
[checked in]add generator code to buildfarm/utils

changeset:   650:455404252ab3
Attachment #453234 - Attachment description: add generator code to buildfarm/utils → [checked in]add generator code to buildfarm/utils
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED