A forced reboot of the DCImanager 6 platform can corrupt the ClickHouse database (DB), after which the clickhouse_server container stops working.

As a result, server statistics are not displayed and the interface shows the error: Error 10106, Graphite request error, Got response code 503.

Diagnostics


  1. Connect to the server with the platform via SSH.
  2. Connect to the clickhouse_server container:

    docker exec -it clickhouse_server sh
  3. Check the clickhouse-server.log for the "ClickHouse init process failed" error:

    grep 'ClickHouse init process failed' /var/log/clickhouse-server/clickhouse-server.log

    If the command returns a match, the ClickHouse database is corrupted. Example of response:

    ClickHouse init process failed.
  4. Check the clickhouse-server.err.log in real time:

    tail -F /var/log/clickhouse-server/clickhouse-server.err.log

    If the ClickHouse database is corrupted, the log contains errors like these:

    2023.02.07 08:42:05.203655 [ 115 ] {} <Error> dci.graphite (ec566f01-a447-406f-b275-92a2b3cd85ab): Detaching broken part /var/lib/clickhouse/store/ec5/ec566f01-a447-406f-b275-92a2b3cd85ab/202212_43911_57173_11575 (size: 0.00 B). If it happened after update, it is likely because of backward incompatibility. You need to resolve this manually
    2023.02.07 08:42:05.211909 [ 115 ] {} <Error> dci.graphite (ec566f01-a447-406f-b275-92a2b3cd85ab): while loading part 202212_43911_57178_11580 on path store/ec5/ec566f01-a447-406f-b275-92a2b3cd85ab/202212_43911_57178_11580: Code: 27. DB::ParsingException: Cannot parse input: expected 'columns format version: 1\n' at end of stream. (CANNOT_PARSE_INPUT_ASSERTION_FAILED), Stack trace (when copying this message, always include the lines below)
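
The log checks above can be sketched as a small shell helper. This is only an illustration: the function names count_matches and check_err_log are hypothetical and not part of DCImanager, and the sketch assumes that the "Detaching broken part" message shown in the example is a sufficient sign of corruption.

```shell
#!/bin/sh
# Sketch: look for signs of ClickHouse corruption in a log file.
# Function names are illustrative and not part of DCImanager.

# Print how many lines of the given log file contain the given fixed string.
count_matches() {
    pattern=$1
    logfile=$2
    # grep -c prints the count but exits with 1 when the count is 0,
    # so the exit status is ignored here.
    grep -cF -- "$pattern" "$logfile" || true
}

# Print "corrupted" if the error log reports broken parts, "ok" otherwise.
# Assumption: 'Detaching broken part' is treated as the corruption marker.
check_err_log() {
    logfile=$1
    broken=$(count_matches 'Detaching broken part' "$logfile")
    if [ "${broken:-0}" -gt 0 ]; then
        echo "corrupted"
    else
        echo "ok"
    fi
}
```

Inside the container, a call could look like check_err_log /var/log/clickhouse-server/clickhouse-server.err.log. Unlike tail -F, this reads the log once and exits, so it suits a one-off check rather than live monitoring.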

Solution


  1. Run the database recovery:

    docker exec -it clickhouse_server touch /var/lib/clickhouse/flags/force_restore_data
  2. Check the clickhouse-server.err.log for errors like this:

    2023.02.08 03:19:33.367657 [ 546 ] {7d518126-889d-4c64-83bf-f87356dc802a} <Error> DynamicQueryHandler: Cannot send exception to client: Code: 24. DB::Exception: Cannot write to ostream at offset 280. (CANNOT_WRITE_TO_OSTREAM)

    You can search for these errors with the command:

    grep 'Cannot write to ostream' /var/log/clickhouse-server/clickhouse-server.err.log
  3. If the command returns such errors, restart the container:

    docker stop clickhouse_server; docker start clickhouse_server
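
The three recovery steps can also be chained in one script run on the platform server. The following is a minimal sketch, not an official procedure: the function name recover_clickhouse is hypothetical, and it assumes the error log is read inside the container through docker exec, at the same path used in the diagnostics.

```shell
#!/bin/sh
# Sketch of the recovery steps. The container name and paths come from the
# article; the function name is hypothetical.

CONTAINER=clickhouse_server
ERR_LOG=/var/log/clickhouse-server/clickhouse-server.err.log

recover_clickhouse() {
    # Step 1: create the flag that requests data restoration.
    docker exec "$CONTAINER" touch /var/lib/clickhouse/flags/force_restore_data || return 1

    # Step 2: look for the ostream error in the error log.
    # Assumption: the log is read inside the container.
    if docker exec "$CONTAINER" grep -q 'Cannot write to ostream' "$ERR_LOG"; then
        # Step 3: restart the container only when the error is present.
        docker stop "$CONTAINER" && docker start "$CONTAINER"
    else
        echo "no 'Cannot write to ostream' errors found, restart skipped"
    fi
}
```

Calling recover_clickhouse performs the steps in order. The -it options are dropped because a script does not need an interactive terminal.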