Static Analysis Of The DeepSeek Android App

Revision as of 10 February 2025, 12:53, by Constance5304



I carried out a static analysis of DeepSeek, a Chinese LLM chatbot, using version 1.8.0 from the Google Play Store. The goal was to identify potential security and privacy concerns.


I've written about DeepSeek previously here.


Additional security and privacy concerns about DeepSeek have been raised.


See also this analysis by NowSecure of the iPhone version of DeepSeek.


The findings detailed in this report are based purely on static analysis. This means that while the code exists within the app, there is no definitive proof that all of it is executed in practice. Nonetheless, the presence of such code warrants scrutiny, especially given the growing concerns around data privacy, surveillance, the potential misuse of AI-driven applications, and cyber-espionage dynamics between global powers.


Key Findings


Suspicious Data Handling & Exfiltration


- Hardcoded URLs direct data to external servers, such as ByteDance "volce.com" endpoints, raising concerns about user activity tracking. NowSecure also recently identified these in the iPhone app.
- Bespoke encryption and data obfuscation methods are present, with indications that they could be used to exfiltrate user details.
- The app contains hard-coded public keys, rather than relying on the user device's chain of trust.
- UI interaction tracking captures detailed user behavior without clear consent.
- WebView manipulation is present, which could allow the app to access private external browser data when links are opened. More details about WebView manipulation are here.
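The first finding (hardcoded URLs) is the kind of thing a short script over decompiled sources can surface. The following is a minimal sketch, not the tooling used for this report: the function names, the sample strings, and the `example-tracker.test` watchlist domain are all hypothetical placeholders.

```python
import re
from collections import Counter

# Matches http(s) URLs and captures the host part.
URL_RE = re.compile(r"https?://([A-Za-z0-9.-]+)(?:[:/][^\s\"'<>]*)?")


def extract_hosts(text):
    """Count the hosts of every hardcoded http(s) URL found in `text`."""
    return Counter(
        m.group(1).lower().rstrip(".") for m in URL_RE.finditer(text)
    )


def flag_hosts(hosts, watchlist):
    """Return hosts that equal, or are subdomains of, a watchlist domain."""
    return sorted(
        h for h in hosts
        if any(h == d or h.endswith("." + d) for d in watchlist)
    )


# Hypothetical decompiled snippet; the URLs are illustrative placeholders,
# not strings taken from the app.
SAMPLE = '''
String a = "https://api.example-tracker.test/v1/collect";
String b = "http://cdn.example.org/asset.js";
String c = "https://log.example-tracker.test:8443/e";
'''

hosts = extract_hosts(SAMPLE)
print(flag_hosts(hosts, {"example-tracker.test"}))
```

In practice the text would come from the output of a decompiler, and the watchlist would hold the domains of interest. A match only shows that the URL is present in the code, not that data is sent to it.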


Device Fingerprinting & Tracking


A significant part of the analyzed code appears to focus on gathering device-specific information, which can be used for tracking and fingerprinting.


- The app collects various unique device identifiers, including UDID, Android ID, IMEI, IMSI, and carrier details.
- System properties, installed packages, and root detection mechanisms suggest potential anti-tampering measures. For example, the app probes for the presence of Magisk, a tool that privacy advocates and security researchers use to root their Android devices.
- Geolocation and network profiling are present, indicating potential tracking capabilities and the enabling or disabling of fingerprinting regimes by region.
- Hardcoded device model lists suggest the application may behave differently depending on the detected hardware.
- Multiple vendor-specific services are used to extract additional device information. For example, if the app cannot identify the device through the standard Android SIM lookup (because permission was not granted), it tries manufacturer-specific extensions to access the same information.
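Findings like those above are typically located by searching decompiled code for known API names. Here is a rough sketch of that idea. The indicator list is an assumption chosen for illustration: it uses public Android API names (`getDeviceId`, `getImei`, `getSubscriberId`, `getSimSerialNumber`, `getSimOperator`, `ANDROID_ID`) and well-known root artifacts (the Magisk package name and a common `su` path), not strings confirmed to be in the DeepSeek app.

```python
# Indicator strings grouped by category. These are public Android API
# names and well-known root-tool artifacts, chosen for illustration only.
INDICATORS = {
    "device_id": ["getDeviceId", "getImei", "ANDROID_ID"],
    "sim": ["getSubscriberId", "getSimSerialNumber", "getSimOperator"],
    "root_check": ["com.topjohnwu.magisk", "/system/xbin/su"],
}


def scan_source(text):
    """Map each category to the sorted indicators that occur in `text`."""
    hits = {}
    for category, needles in INDICATORS.items():
        found = sorted(n for n in needles if n in text)
        if found:
            hits[category] = found
    return hits


# Hypothetical decompiled Java, standing in for real decompiler output.
SAMPLE = '''
String id = Settings.Secure.getString(resolver, Settings.Secure.ANDROID_ID);
String imsi = telephonyManager.getSubscriberId();
boolean rooted = new File("/system/xbin/su").exists();
'''

for category, found in scan_source(SAMPLE).items():
    print(category, found)
```

Plain substring matching is deliberately crude. It will miss identifiers that are assembled at runtime or hidden behind reflection, which is one reason obfuscation complicates this kind of review.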


Potential Malware-Like Behavior


While no definitive conclusions can be drawn without dynamic analysis, several observed behaviors align with known spyware and malware patterns:


- The app uses reflection and UI overlays, which could facilitate unauthorized screen capture or phishing attacks.
- SIM card details, serial numbers, and other device-specific data are aggregated for unknown purposes.
- The app implements country-based access restrictions and "risk-device" detection, suggesting possible surveillance mechanisms.
- The app makes calls to load Dex modules, where additional code is loaded from files with a .so extension at runtime.
- The .so files themselves turn around and make additional calls to dlopen(), which can be used to load additional .so files. This facility is not typically checked by Google Play Protect and other static analysis services.
- The .so files can be implemented in native code, such as C++. The use of native code adds a layer of complexity to the analysis process and obscures the full extent of the app's capabilities. Moreover, native code can be leveraged to more easily escalate privileges, potentially exploiting vulnerabilities within the operating system or device hardware.
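The last three findings concern runtime code loading. A first pass at spotting it can be sketched as follows: treat the APK as the ZIP archive it is, list the bundled native libraries under `lib/`, and look for loader-related byte strings in `.dex` and `.so` entries. The marker list, the function names, and the in-memory stand-in APK are assumptions made for illustration.

```python
import io
import zipfile

# Byte markers associated with runtime code loading. Their presence only
# shows that the facility exists, not that it is exercised.
LOADER_MARKERS = [b"dlopen", b"DexClassLoader", b"loadLibrary"]


def audit_apk(apk_bytes):
    """Return (native_libs, marker_hits) for an APK given as raw bytes."""
    native_libs, hits = [], {}
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        for name in apk.namelist():
            if name.startswith("lib/") and name.endswith(".so"):
                native_libs.append(name)
            if name.endswith((".so", ".dex")):
                data = apk.read(name)
                found = [m.decode() for m in LOADER_MARKERS if m in data]
                if found:
                    hits[name] = found
    return sorted(native_libs), hits


def build_fake_apk():
    """Build a tiny in-memory ZIP that stands in for a real APK."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("classes.dex", b"...Ldalvik/system/DexClassLoader;...")
        z.writestr("lib/arm64-v8a/libdemo.so", b"\x7fELF...dlopen...")
        z.writestr("assets/readme.txt", b"nothing here")
    return buf.getvalue()


libs, hits = audit_apk(build_fake_apk())
print(libs)
print(hits)
```

A hit on `dlopen` inside a `.so` file is only a starting point. Working out what gets loaded, and from where, requires disassembling the native code or observing the app at runtime.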


Remarks


While data collection is common in modern applications for debugging and improving user experience, aggressive fingerprinting raises significant privacy concerns. The DeepSeek app requires users to log in with a valid email, which should already provide sufficient authentication. There is no legitimate reason for the app to aggressively collect and transmit unique device identifiers, IMEI numbers, SIM card details, and other non-resettable system properties.


The extent of tracking observed here goes beyond typical analytics practices, potentially enabling persistent user tracking and re-identification across devices. These behaviors, combined with obfuscation techniques and network communication with third-party tracking services, warrant a higher level of scrutiny from security researchers and users alike.


The use of runtime code loading together with the bundling of native code suggests that the app could allow the deployment and execution of unreviewed, remotely delivered code. This is a serious potential attack vector. This report presents no evidence that remotely deployed code is actually being executed, only that the facility for it appears to be present.


Additionally, the app's approach to detecting rooted devices appears excessive for an AI chatbot. Root detection is often justified in DRM-protected streaming services, where security and content protection are critical, or in competitive video games to prevent cheating. However, there is no clear rationale for such strict measures in an application of this nature, raising further questions about its intent.


Users and organizations considering installing DeepSeek should be aware of these potential risks. If this application is being used within an enterprise or government environment, additional vetting and security controls should be enforced before allowing its deployment on managed devices.


Disclaimer: The analysis presented in this report is based on static code review and does not imply that all identified functions are actively used. Further investigation is required for definitive conclusions.