Fix the bloody importer
#1
Frankly, the importer is a major pain in its current state.
It's bad enough that I've considered simply rewriting it many times.

This is what should be done:
1. Split the sets into separate files, since SQLite databases can be finicky and prone to corruption. A single exchange/set getting corrupted is a non-problem; an entire database with 10+ sets getting corrupted is much worse. Separate files would also make it vastly easier to share sets, manage them, convert them to other databases, etc.

2. Use free file-sharing services such as transfer.sh to act as a proxy, so that not every single user has to hammer the exchanges for the same data over and over again.

-

Flow
This is a flow that could work:
a) download gekko_history.txt (basically an index file) that contains e.g. this CSV data:


Code:
EXCHANGE,ASSET,CURRENCY,FROM,TO,FILE
poloniex,NEO,USDT,2016-01-01,2018-04-01,https://transfer.sh/XJjVl/poloniex_neo_usdt_2016-01-01--2018-04-01.tar.gz


b) Check whether the user's request falls within a date range in the CSV data; if so, download the file from the listed link, untar it, etc.
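Step (b) could be sketched as a small lookup over the index file. This is a minimal sketch, not Gekko code: the column names come from the CSV example above, `findSet` is a hypothetical helper, and the actual download/untar step is left out (it is just an HTTP GET plus extracting the tarball).

```javascript
// Find the index entry that fully covers a requested date range.
const INDEX = `EXCHANGE,ASSET,CURRENCY,FROM,TO,FILE
poloniex,NEO,USDT,2016-01-01,2018-04-01,https://transfer.sh/XJjVl/poloniex_neo_usdt_2016-01-01--2018-04-01.tar.gz`;

function findSet(index, exchange, asset, currency, from, to) {
  const [header, ...rows] = index.trim().split('\n');
  const cols = header.split(',');
  return rows
    // turn each CSV row into an object keyed by the header columns
    .map(row => Object.fromEntries(row.split(',').map((v, i) => [cols[i], v])))
    .find(r =>
      r.EXCHANGE === exchange && r.ASSET === asset && r.CURRENCY === currency &&
      r.FROM <= from && to <= r.TO   // ISO dates compare correctly as strings
    );
}

const hit = findSet(INDEX, 'poloniex', 'NEO', 'USDT', '2017-06-01', '2017-12-31');
console.log(hit ? hit.FILE : 'no cached set, fall back to the exchange importer');
```

If no row covers the requested range, the importer would simply fall back to pulling from the exchange as it does today.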


-

Any file service that allows simple, scriptable transfers could be used. transfer.sh is just an example, since it's one of the simplest I've found for this sort of thing.
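As a sketch of the sharing side: packaging one set and uploading it is a couple of shell commands. The file names below follow the index example above, the `.db` file is only a stand-in for a real exported set, and the upload itself is left as a comment since it hits the network (transfer.sh accepts a plain `curl --upload-file` and answers with the download URL).

```shell
SET=poloniex_neo_usdt_2016-01-01--2018-04-01
touch "$SET.db"                    # stand-in for a real per-set database file
tar -czf "$SET.tar.gz" "$SET.db"   # compress the set for sharing

# upload and get back the link to put into gekko_history.txt:
#   curl --upload-file "$SET.tar.gz" "https://transfer.sh/$SET.tar.gz"
ls "$SET.tar.gz"
```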
#2
Quote:1. Split the sets into separate files, since SQLite databases can be finicky and prone to corruption. A single exchange/set getting corrupted is a non-problem; an entire database with 10+ sets getting corrupted is much worse. Separate files would also make it vastly easier to share sets, manage them, convert them to other databases, etc.

A lot of people experience errors here; I've only had it happen once before (around two years ago). What is your OS?
#3
(04-17-2018, 09:56 AM)askmike Wrote:
Quote:1. Split the sets into separate files, since SQLite databases can be finicky and prone to corruption. A single exchange/set getting corrupted is a non-problem; an entire database with 10+ sets getting corrupted is much worse. Separate files would also make it vastly easier to share sets, manage them, convert them to other databases, etc.

A lot of people experience errors here; I've only had it happen once before (around two years ago). What is your OS?

This is highly dependent on how SQLite interacts with the host OS (hence your question). I haven't had much trouble myself, but I know that file locking can be somewhat of an issue and can easily leave a SQLite db corrupted/malformed.

I run a lot of OSes, since I run Gekko on six computers: some Debian, some Ubuntu, some Windows 10 + WSL.

This topic is about solving the architecture of the dataset downloader, though.
The OS doesn't really have much to do with that?
#4
If you have problems with the importer or don't want to wait a long time for imports, you can use my history dumps. More info here: https://github.com/xFFFFF/Gekko-Datasets
I downloaded the full history of some exchanges and the importer works fine for me. Gekko is a trading bot, not a web portal; it doesn't need advanced database structures or 200 connections to the database. Of course, I'm talking about native Gekko without any third-party software.

SQLite is easy to merge/split and to manipulate however you want via SQL statements, and it doesn't need a separate server to run. That's good when using a VPS with limited resources. I will develop some simple scripts to split, export, and merge.
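The split/merge can indeed be done with plain `sqlite3` and `ATTACH`. A minimal sketch, with the caveat that the table name `candles_USDT_NEO` is an assumption about Gekko's naming (check `.tables` on your own database), and the first command just fabricates a toy "combined" db to operate on:

```shell
# stand-in for an existing combined Gekko database with one candle table
sqlite3 combined.db "CREATE TABLE IF NOT EXISTS candles_USDT_NEO (start INTEGER PRIMARY KEY, open REAL, close REAL);
                     INSERT OR IGNORE INTO candles_USDT_NEO VALUES (1514764800, 7.1, 7.2), (1514764860, 7.2, 7.3);"

# split: copy one pair's table out into its own file
sqlite3 combined.db "ATTACH 'neo_usdt.db' AS single;
                     CREATE TABLE single.candles_USDT_NEO AS SELECT * FROM candles_USDT_NEO;"

# merge: append that set into another combined db, skipping duplicate candles
sqlite3 other.db "ATTACH 'neo_usdt.db' AS single;
                  CREATE TABLE IF NOT EXISTS candles_USDT_NEO (start INTEGER PRIMARY KEY, open REAL, close REAL);
                  INSERT OR IGNORE INTO candles_USDT_NEO SELECT * FROM single.candles_USDT_NEO;"
```

`INSERT OR IGNORE` keyed on the candle start timestamp makes the merge idempotent, so re-merging overlapping sets doesn't duplicate rows.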
My projects [Strategies] [Datasets]
#5
(04-17-2018, 03:40 PM)xFFFFF Wrote: If you have problems with the importer or don't want to wait a long time for imports, you can use my history dumps. More info here: https://github.com/xFFFFF/Gekko-Datasets
I downloaded the full history of some exchanges and the importer works fine for me. Gekko is a trading bot, not a web portal; it doesn't need advanced database structures or 200 connections to the database. Of course, I'm talking about native Gekko without any third-party software.

SQLite is easy to merge/split and to manipulate however you want via SQL statements, and it doesn't need a separate server to run. That's good when using a VPS with limited resources. I will develop some simple scripts to split, export, and merge.

I don't have any problems with imported data?

# stuff
Yes, it's great that you have uploaded stuff.
But this "stuff" is basically the core of the problem.
It's all "stuff" and not specific. Why download 10 pairs if you only want one of them?
It makes no sense.

# connections
I don't see how a single connection would all of a sudden become 200 for no apparent reason.
Also, I don't see how a single download has anything to do with a VPS. Any VPS that can download
data would download it just as well from this source as from any other.
It would just go 1000x faster. If you've got a VPS that can't handle simple downloads, maybe it's time to upgrade
to something that costs more than $1/month.

# complexity
The suggested structure is neither advanced nor complex; it's much simpler.
Yes, Gekko is a trading bot. However backtesting is a core part of it (and thus downloading data).

# no way
As far as I know, there is currently no way of downloading something specific and merging it automatically.
Or have you created some sort of API that takes specific request parameters and solves that?
#6
My problems with the importer resulted only from internet connection errors or my own mistakes. The only error occurs when importing from Kraken, but it doesn't affect the database. I'm using Gekko on Ubuntu, Linux Mint and FreeBSD.

I just wrote why, in my opinion, running Gekko's database on SQLite is the best solution. If you need something more advanced, Gekko supports PostgreSQL and MongoDB, or check this solution with MySQL: https://github.com/jordanmmyers/mb-dev

Have you seen how much space the database files from the last 60 days occupy? Is downloading 150 MB such a big problem? It should take a minute. Exporting 10 pairs from such a file should take about 5-15 seconds.

askmike plans to provide datasets in his premium services. My solution is open source; open source will never be as good as commercial projects and won't fit everyone, but everyone can use my work to do it their own way.

And in my opinion, there are more important things in Gekko to improve than the importer: for example, the trader and portfolio manager, since trading is what Gekko is used for.
#7
What you say literally has nothing to do with this thread.
#8
I have to agree that there's something suboptimal about Gekko datasets and downloads.

I had a few coins' worth of data collected. It was taking several minutes of setup each time I restarted Gekko. Minutes of heavy disk grinding. Why? This is a database. Must we re-read every record each time before using the database? That's crazy.

A few moments ago I did the wait for data to load.  Lots of grinding and waiting.

After the usual long delay  it says:

             No Data Found


Damn!
#9
(04-17-2018, 09:12 PM)richard Wrote: I have to agree that there's something suboptimal about Gekko datasets and downloads. [...] After the usual long delay it says: No Data Found

This doesn't have much to do with this thread either, but these sorts of annoyances would be less likely with the proposed solution and architecture,
since a fresh download of a pair simply wouldn't take that much time, so retrying wouldn't be such an issue. Since it would be so much faster, one could even remove the old set completely and retry.
  Reply
#10
Tbf, if you're a hardcore user of Gekko (and you guys mostly are), move away from SQLite and use Postgres.