[mender-cli] "Not able to determine users cache dir" error

Hi there,

I am posting here because I have not found an answer by searching the forum.

I am using mender-cli to interface w/ our 2.4 on-premises Mender server. It is being integrated into a Python microservice which uses subprocess.run() to call it (we are looking into using the APIs directly).

Using the 1.4.0 version, I am seeing the following error being returned when called from Python code:

FAILURE: Not able to determine users cache dir

subprocess.run() uses a command array as follows:

/usr/local/bin/mender-cli login --server https://mender-api.fqdn --username user@email.address --password XXX

However, I am unable to reproduce the error from an interactive shell (both the microservice and the interactive shell run as root). The same command logs in fine when called manually.

I see the issue with the v1.4.0 binary (downloaded from the doc link) but not with binaries freshly built from either the 1.4.x or HEAD GitHub branches. I am not sure what has changed, especially for 1.4.x, judging by the changelog.

So here are my questions:

  1. Is there a way to know which commit 1.4.0 was built against?
  2. I see that 1.5.0 is out but the doc does not mention building from source anymore (contrary to the 2.4 ones). Is that not supported anymore?

Thanks much for your help!

Thank you for the report.

I will have a quick look at it now :slight_smile:

@vrubiolo It looks to me like it is an environment error. It stems from the cli program not finding the $HOME variable, or the $USER var.

Can you try it with both of these variables set in the environment of your Python wrapper?

Not the best error message imo, I think I will amend it now.
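As a rough illustration of that suggestion, here is a minimal sketch of passing $HOME and $USER explicitly to the child process. The helper names `env_with_home_and_user` and `run_mender_cli` are hypothetical, and the binary path is simply the one quoted earlier in the thread; this is not how mender-cli itself resolves the cache directory.

```python
import os
import pwd
import subprocess


def env_with_home_and_user(base_env=None):
    """Return a copy of the environment with HOME and USER filled in if missing.

    Hypothetical helper: values are taken from the passwd entry of the
    current uid only when the variable is absent from the environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    if "HOME" not in env or "USER" not in env:
        account = pwd.getpwuid(os.getuid())
        env.setdefault("HOME", account.pw_dir)
        env.setdefault("USER", account.pw_name)
    return env


def run_mender_cli(args, cli="/usr/local/bin/mender-cli"):
    """Run mender-cli with HOME and USER present in its environment."""
    return subprocess.run([cli, *args], env=env_with_home_and_user())
```

A call would then look like `run_mender_cli(["login", "--server", "https://mender-api.fqdn", ...])`, with the remaining arguments as in the original command.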

@oleorhagen: thanks for looking into this.

The variables you mentioned are correctly set, here is the associated line in the Python code to check for that:

self.logger.info('Plugin running as {}'.format(pwd.getpwuid(os.getuid())))

and the result at runtime:

Plugin running as pwd.struct_passwd(pw_name='root', pw_passwd='x', pw_uid=0, pw_gid=0, pw_gecos='root', pw_dir='/root', pw_shell='/bin/bash')
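One caveat on this check: `pwd.getpwuid()` reads the account database, so on its own it does not show what $HOME and $USER are in the process environment that the child inherits. A minimal sketch for logging those directly follows; `describe_env` is a hypothetical helper name, not something from the plugin.

```python
import os


def describe_env(names=("HOME", "USER"), environ=None):
    """Map each variable name to its value, or None when it is not set."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name) for name in names}


# In the plugin this could be logged next to the existing pwd line, e.g.:
# self.logger.info('Plugin environment: {}'.format(describe_env()))
print("Plugin environment: {}".format(describe_env()))
```

A value of `None` for either name would mean the variable is absent from the environment even though the passwd entry looks correct.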

Also, as I said, switching mender-cli binaries (from 1.4.0 to the 1.4.x branch build) makes the issue disappear (with the same Python code).

I also built the associated Go code that is within mender-cli as a standalone binary (I cannot attach it though, only images are allowed seemingly) and it does not error out. I can provide it if you have another way to upload source code.

I agree with you about the error message, it should be clearer that it could not get either $HOME or $USER.

If you are able to tell me which commit 1.4.0 is built against, or if you can provide an instrumented mender-cli binary, I can try to provide more information too.

I see. That is interesting indeed. I was pretty sure this would be the environment variables missing.

From what I gather 1.4.0 should be: 9b230ccbb4b344cdb69a07348fe024013c0fae36.

I’m going to have a look into which Go version the released binary is actually built with in the meantime.

Also, building from source is still very much supported. I forgot that part :slight_smile:

I’m not sure why this was removed.

@vrubiolo I have not been able to reproduce this.

I downloaded the 1.4.0 version from our doc site, and ran this:

#!/usr/bin/python3

import subprocess
import sys


subprocess.run(["./mender-cli", "login", "--server", "https://hosted.mender.io", "--username", "ole.orhagen@northern.tech", "--password", "xxxxxx"])

and it ran just fine.

Could you provide me something I can try and reproduce this with?

Let me check on my end how I can narrow a testcase down for you.

In the meantime, I have rebuilt mender-cli from 9b230ccbb4b344cdb69a07348fe024013c0fae36 and the problem does not show up using this binary (I am using Go 1.14.6 on Fedora 32).

@vrubiolo the released mender-cli binary is built with Go 1.11. Can you try the one built with the Dockerfile in the repo and see whether you can reproduce the issue then?

Thanks for the suggestion. I can reproduce the issue with the Docker build too. Here is a build log:

$ git log -1
commit 9b230ccbb4b344cdb69a07348fe024013c0fae36
Merge: 257192b 9c24c88
Author: oleorhagen <ole.orhagen@northern.tech>
Date:   Fri May 8 11:10:33 2020 +0200

    Merge pull request #58 from oleorhagen/QA-161
    
    Qa 161
$ docker build .
Sending build context to Docker daemon  13.98MB
Step 1/9 : FROM golang:1.11-alpine3.9 as builder
 ---> 991f9ef3b182
Step 2/9 : RUN apk update && apk add make git
 ---> Running in a9afcab8fc37
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/x86_64/APKINDEX.tar.gz
v3.9.6-75-gfa811e8d71 [http://dl-cdn.alpinelinux.org/alpine/v3.9/main]
v3.9.6-68-gb2b6f289a8 [http://dl-cdn.alpinelinux.org/alpine/v3.9/community]
OK: 9778 distinct packages available
(1/7) Installing nghttp2-libs (1.35.1-r2)
(2/7) Installing libssh2 (1.9.0-r1)
(3/7) Installing libcurl (7.64.0-r4)
(4/7) Installing expat (2.2.8-r0)
(5/7) Installing pcre2 (10.32-r1)
(6/7) Installing git (2.20.4-r0)
(7/7) Installing make (4.2.1-r2)
Executing busybox-1.29.3-r10.trigger
OK: 21 MiB in 22 packages
Removing intermediate container a9afcab8fc37
 ---> 93c013a6c396
Step 3/9 : RUN mkdir -p /go/src/github.com/mendersoftware/mender-cli
 ---> Running in f82c67c25ff0
Removing intermediate container f82c67c25ff0
 ---> 1e9b21990097
Step 4/9 : WORKDIR /go/src/github.com/mendersoftware/mender-cli
 ---> Running in 4f99002b818d
Removing intermediate container 4f99002b818d
 ---> c5e928295042
Step 5/9 : ADD ./ .
 ---> 7f761e18feb9
Step 6/9 : RUN make build
 ---> Running in 89371b14c96f
CGO_ENABLED=0 go build -ldflags "-X main.Version=1.4.0"  
Removing intermediate container 89371b14c96f
 ---> f6f69b05bbe2
Step 7/9 : FROM busybox
 ---> 6858809bf669
Step 8/9 : COPY --from=builder /go/src/github.com/mendersoftware/mender-cli/mender-cli /
 ---> afd833db4b2a
Step 9/9 : ENTRYPOINT ["/mender-cli"]
 ---> Running in 9d1449b37e27
Removing intermediate container 9d1449b37e27
 ---> 21bca626b378
Successfully built 21bca626b378

I then pulled the binary from the container via docker cp and put it on my server where I can also reproduce the issue.

As for a testcase, I am doing some experiments with Koji, the Fedora build system and the Python code is running within Koji as a plugin itself so I am afraid the setup might be more complex to reproduce.

Would you have a way to pass down an instrumented mender-cli binary instead?

@vrubiolo how about you update to Go 1.15 in the Dockerfile, then build and run? Does the same thing happen?
If it does not, I think we should just bump our version and save ourselves the digging into an (unsupported/old?) Go version.

Thanks for the suggestion. Let me look into this, I should have an answer at the end of the week.

Hi again,

Ok I tried bumping the Go version used in the Dockerfile and rebuilt:

diff --git a/Dockerfile b/Dockerfile
index 7ff18dd..7f58159 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,4 +1,4 @@
-FROM golang:1.11-alpine3.9 as builder
+FROM golang:1.15.2-alpine3.12 as builder
 RUN apk update && apk add make git
 RUN mkdir -p /go/src/github.com/mendersoftware/mender-cli
 WORKDIR /go/src/github.com/mendersoftware/mender-cli
@@ -7,4 +7,4 @@ RUN make build
 
 FROM busybox
 COPY --from=builder /go/src/github.com/mendersoftware/mender-cli/mender-cli /

I have hit an issue at login time about the server certificates:

FAILURE: POST /auth/login request failed: Post "https://mender-api.XXX/api/management/v1/useradm/auth/login": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0

This is a Mender 2.4 server. Setting GODEBUG=x509ignoreCN=0 in the environment bypasses the error.
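For the Python side, a minimal sketch of how the variable could be passed to the child process. The helper names `env_with_godebug` and `login` are hypothetical, and the binary path is the one used earlier in the thread. An existing GODEBUG value is kept and the new setting is appended, since GODEBUG takes a comma-separated list.

```python
import os
import subprocess


def env_with_godebug(setting="x509ignoreCN=0", base_env=None):
    """Return a copy of the environment with `setting` added to GODEBUG."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("GODEBUG")
    # GODEBUG is a comma-separated list of name=value pairs.
    env["GODEBUG"] = "{},{}".format(existing, setting) if existing else setting
    return env


def login(server, username, password, cli="/usr/local/bin/mender-cli"):
    """Call `mender-cli login` with Common Name matching re-enabled."""
    cmd = [cli, "login", "--server", server,
           "--username", username, "--password", password]
    return subprocess.run(cmd, env=env_with_godebug())
```

As the error message itself says, this only re-enables Common Name matching temporarily, so it is a stopgap rather than a fix for the certificates.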

So I confirm that, unless the GODEBUG option significantly changes the mender-cli behavior, the issue appears to be fixed by switching to Go 1.15.

The certificate issue will probably have to be addressed though, because the updated binaries will otherwise trigger certificate errors (just as I saw).

Alright, great! Thanks @vrubiolo I will bump the version, and then have a look at the error.

Cool! Let me know if you need more information from me :+1:

For the record, here is the Go change triggering the error: https://go-review.googlesource.com/c/go/+/231379, as well as a thread about the consequences for AWS RDS (which were fixed): https://github.com/golang/go/issues/39568. It looks like the Mender certs might need to be updated to use SANs instead of CNs.

@vrubiolo sorry for the late response, just a lot going on.

Nice investigation, thanks! I’ve opened a PR for bumping the Go version in the first iteration. I will look into the certificate issue now.

PR: https://github.com/mendersoftware/mender-cli/pull/75

No worries, and thanks for reporting back w/ the link to the PR.
Are you taking the chance to improve the error message as well?