Pose Detection in Android with ML Kit & Jetpack Compose | Real-time Pose Skeleton
Pose detection is an exciting field in mobile AI that enables real-time analysis of human body movements. In this article, we will explore how to implement pose detection in an Android application using Google’s ML Kit and Jetpack Compose. We will also visualise the detected pose as a skeleton overlay on a camera preview.
Prerequisites
Before diving in, ensure you have the following:
- Android Studio installed
- A basic understanding of Jetpack Compose
- A physical Android device (pose detection might not work properly on an emulator)
What is Pose Detection?
Pose detection identifies human body landmarks such as joints and key points (e.g., shoulders, elbows, knees). It is commonly used in fitness tracking, augmented reality (AR), and gesture-based interactions.
Why ML Kit for Pose Detection?
ML Kit is a powerful set of machine learning tools provided by Google. It offers on-device pose detection with:
- Real-time performance
- No internet requirement
- Easy integration
Setting Up ML Kit Pose Detection
To use ML Kit’s pose detection in your Android app, declare the artifacts in your version catalog (libs.versions.toml) and add the dependencies to your module’s build.gradle.kts file:
[versions]
poseDetectionAccurate = "18.0.0-beta5"
poseDetection = "18.0.0-beta5"
[libraries]
pose-detection-accurate = { group = "com.google.mlkit", name = "pose-detection-accurate", version.ref = "poseDetectionAccurate" }
pose-detection = { group = "com.google.mlkit", name = "pose-detection", version.ref = "poseDetection" }
Then reference them from your module-level build.gradle.kts:
implementation(libs.pose.detection.accurate)
implementation(libs.pose.detection)
CameraX Setup
We use CameraX to access the camera feed and process frames for pose detection. Add the dependencies:
[versions]
cameraxVersion = "1.4.1"
[libraries]
camera-core = { group = "androidx.camera", name = "camera-core", version.ref = "cameraxVersion" }
camera-camera2 = { group = "androidx.camera", name = "camera-camera2", version.ref = "cameraxVersion" }
camera-view = { group = "androidx.camera", name = "camera-view", version.ref = "cameraxVersion" }
camera-lifecycle = { group = "androidx.camera", name = "camera-lifecycle", version.ref = "cameraxVersion" }
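Then reference them from your module-level build.gradle.kts (the alias names below follow from the catalog entries above):
implementation(libs.camera.core)
implementation(libs.camera.camera2)
implementation(libs.camera.view)
implementation(libs.camera.lifecycle)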
Implementing Pose Detection
- Initialize CameraX: We start by setting up CameraX to capture frames for processing.
- Process Frames with ML Kit: Convert the ImageProxy to an InputImage and feed it to ML Kit’s Pose Detector.
- Draw Pose Skeleton: Use Jetpack Compose’s Canvas to overlay the detected pose on the camera preview.
Code Implementation
1. Setting up CameraX
Before using CameraX, you need to declare the necessary permissions in your AndroidManifest.xml file:
<uses-feature
android:name="android.hardware.camera"
android:required="true" />
<uses-permission android:name="android.permission.CAMERA" />
The uses-permission element requests access to the device’s camera, while the uses-feature entry declares that camera hardware is required for the app to function.
Below is the Kotlin implementation for requesting camera permissions, handling user responses, and setting up the camera preview using Jetpack Compose.
class MainActivity : ComponentActivity() {
private val cameraPermissionRequest =
registerForActivityResult(ActivityResultContracts.RequestPermission()) { isGranted ->
if (isGranted) {
setCameraPreview()
} else {
openPermissionSettings()
}
}
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
requestedOrientation = ActivityInfo.SCREEN_ORIENTATION_SENSOR_LANDSCAPE
checkCameraPermission()
enableEdgeToEdge()
}
private fun checkCameraPermission() {
when (PackageManager.PERMISSION_GRANTED) {
ContextCompat.checkSelfPermission(
this,
android.Manifest.permission.CAMERA,
) -> {
setCameraPreview()
}
else -> {
cameraPermissionRequest.launch(android.Manifest.permission.CAMERA)
}
}
}
private fun setCameraPreview() {
setContent {
PoseDetectionTheme {
Scaffold(
modifier = Modifier.fillMaxSize(),
) { innerPadding ->
CameraScreen(
modifier = Modifier.padding(innerPadding)
)
}
}
}
}
private fun openPermissionSettings() {
Intent(ACTION_APPLICATION_DETAILS_SETTINGS).also {
val uri = Uri.fromParts("package", packageName, null)
it.data = uri
startActivity(it)
}
}
}
Permission Handling:
- The app checks for camera permission using ContextCompat.checkSelfPermission.
- If the permission is granted, it proceeds to setCameraPreview().
- Otherwise, it requests permission using ActivityResultContracts.RequestPermission().
- If denied, openPermissionSettings() prompts the user to manually enable the permission in settings.
Setting Up the Camera Preview:
- setCameraPreview() initialises the Jetpack Compose UI and calls CameraScreen (a possible implementation of CameraScreen is sketched later in the article).
- The Scaffold component ensures proper layout handling.
Handling Orientation:
- requestedOrientation = ActivityInfo.SCREEN_ORIENTATION_SENSOR_LANDSCAPE keeps the activity in landscape mode (in either sensor direction).
2. Processing Frames with ML Kit
Processing frames with ML Kit’s Pose Detection API enables real-time analysis of human body movements. Let’s break down the frame-processing implementation and see how it works step by step.
Below is the Kotlin function that processes camera frames, detects human pose landmarks, and updates UI elements accordingly.
@OptIn(ExperimentalGetImage::class)
fun processImageProxy(
imageProxy: ImageProxy,
poseLandmarks: SnapshotStateList<PoseLandmark>,
imageWidth: MutableState<Int>,
imageHeight: MutableState<Int>,
) {
val mediaImage = imageProxy.image ?: return
val image = InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees)
val options = PoseDetectorOptions.Builder()
.setDetectorMode(PoseDetectorOptions.STREAM_MODE)
.setPreferredHardwareConfigs(PoseDetectorOptions.CPU_GPU)
.build()
val poseDetector = PoseDetection.getClient(options)
poseDetector.process(image)
.addOnSuccessListener { pose ->
poseLandmarks.clear()
poseLandmarks.addAll(pose.allPoseLandmarks)
imageWidth.value = mediaImage.width
imageHeight.value = mediaImage.height
}
.addOnFailureListener { e ->
Log.e("PoseDetection", "Detection failed", e)
}
.addOnCompleteListener {
imageProxy.close()
}
}
Extracting Image Data from ImageProxy
- imageProxy.image ?: return ensures the frame contains a valid image before processing.
- InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees) converts mediaImage into a format compatible with ML Kit, applying the necessary rotation.
Configuring the Pose Detector
- PoseDetectorOptions.Builder() is used to define the detection parameters:
- setDetectorMode(PoseDetectorOptions.STREAM_MODE): Enables continuous pose detection for real-time applications.
- setPreferredHardwareConfigs(PoseDetectorOptions.CPU_GPU): Allows the detector to leverage both CPU and GPU for better performance.
- PoseDetection.getClient(options): Creates an instance of the pose detector with the specified options.
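As a side note, the pose-detection-accurate artifact added earlier ships a heavier but more precise model. It is not used in the code above; if you want to try it, the corresponding options class is AccuratePoseDetectorOptions (from com.google.mlkit.vision.pose.accurate), and the rest of the pipeline stays the same. A sketch:
// Accurate model: higher landmark quality at the cost of extra latency.
val accurateOptions = AccuratePoseDetectorOptions.Builder()
    .setDetectorMode(AccuratePoseDetectorOptions.STREAM_MODE)
    .build()
val accuratePoseDetector = PoseDetection.getClient(accurateOptions)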
Processing the Image with ML Kit
- poseDetector.process(image) runs pose detection on the input image.
- addOnSuccessListener { pose -> ... }: If successful, it extracts all detected landmarks and updates poseLandmarks.
- imageWidth.value = mediaImage.width and imageHeight.value = mediaImage.height store the dimensions of the processed image.
- addOnFailureListener { e -> Log.e("PoseDetection", "Detection failed", e) }: Handles errors if pose detection fails.
- addOnCompleteListener { imageProxy.close() }: Ensures that the ImageProxy is properly closed after processing to prevent memory leaks.
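One refinement worth considering: processImageProxy builds a new detector client for every frame. Since the options never change, you could create the client once in the composable that owns the analyzer, pass it into the function, and close it when the screen leaves composition. A minimal sketch (not part of the original code):
// Hypothetical: create the detector once and reuse it across frames.
val poseDetector = remember {
    PoseDetection.getClient(
        PoseDetectorOptions.Builder()
            .setDetectorMode(PoseDetectorOptions.STREAM_MODE)
            .build()
    )
}
DisposableEffect(Unit) {
    // Release the ML Kit detector when the composable leaves the composition.
    onDispose { poseDetector.close() }
}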
3. Drawing Pose Skeleton in Jetpack Compose
With detection in place, the next step is to visually represent the detected pose landmarks using Jetpack Compose’s Canvas component. The following Kotlin code draws the pose landmarks and their connections as an overlay on the screen.
@Composable
fun PoseOverlay(
poseLandmarks: List<PoseLandmark>,
imageWidth: Int,
imageHeight: Int,
canvasWidth: Float,
canvasHeight: Float,
) {
val scaleX = canvasWidth / imageWidth
val scaleY = canvasHeight / imageHeight
Canvas(modifier = Modifier.fillMaxSize()) {
for (landmark in poseLandmarks) {
val adjustedX = landmark.position.x * scaleX
val adjustedY = landmark.position.y * scaleY
drawCircle(
color = Color.Red,
radius = 8f,
center = Offset(adjustedX, adjustedY)
)
}
val connections = listOf(
PoseLandmark.LEFT_EYE to PoseLandmark.RIGHT_EYE,
PoseLandmark.LEFT_EYE to PoseLandmark.LEFT_EAR,
PoseLandmark.RIGHT_EYE to PoseLandmark.RIGHT_EAR,
PoseLandmark.NOSE to PoseLandmark.LEFT_EYE,
PoseLandmark.NOSE to PoseLandmark.RIGHT_EYE,
PoseLandmark.NOSE to PoseLandmark.LEFT_MOUTH,
PoseLandmark.NOSE to PoseLandmark.RIGHT_MOUTH,
PoseLandmark.LEFT_SHOULDER to PoseLandmark.RIGHT_SHOULDER,
PoseLandmark.LEFT_SHOULDER to PoseLandmark.LEFT_HIP,
PoseLandmark.RIGHT_SHOULDER to PoseLandmark.RIGHT_HIP,
PoseLandmark.LEFT_HIP to PoseLandmark.RIGHT_HIP,
PoseLandmark.LEFT_SHOULDER to PoseLandmark.LEFT_ELBOW,
PoseLandmark.LEFT_ELBOW to PoseLandmark.LEFT_WRIST,
PoseLandmark.RIGHT_SHOULDER to PoseLandmark.RIGHT_ELBOW,
PoseLandmark.RIGHT_ELBOW to PoseLandmark.RIGHT_WRIST,
PoseLandmark.LEFT_WRIST to PoseLandmark.LEFT_INDEX,
PoseLandmark.LEFT_WRIST to PoseLandmark.LEFT_PINKY,
PoseLandmark.LEFT_WRIST to PoseLandmark.LEFT_THUMB,
PoseLandmark.RIGHT_WRIST to PoseLandmark.RIGHT_INDEX,
PoseLandmark.RIGHT_WRIST to PoseLandmark.RIGHT_PINKY,
PoseLandmark.RIGHT_WRIST to PoseLandmark.RIGHT_THUMB,
PoseLandmark.LEFT_HIP to PoseLandmark.LEFT_KNEE,
PoseLandmark.LEFT_KNEE to PoseLandmark.LEFT_ANKLE,
PoseLandmark.RIGHT_HIP to PoseLandmark.RIGHT_KNEE,
PoseLandmark.RIGHT_KNEE to PoseLandmark.RIGHT_ANKLE,
PoseLandmark.LEFT_ANKLE to PoseLandmark.LEFT_HEEL,
PoseLandmark.LEFT_ANKLE to PoseLandmark.LEFT_FOOT_INDEX,
PoseLandmark.RIGHT_ANKLE to PoseLandmark.RIGHT_HEEL,
PoseLandmark.RIGHT_ANKLE to PoseLandmark.RIGHT_FOOT_INDEX
)
for ((start, end) in connections) {
val startLandmark = poseLandmarks.find { it.landmarkType == start }
val endLandmark = poseLandmarks.find { it.landmarkType == end }
if (startLandmark != null && endLandmark != null) {
val startX = startLandmark.position.x * scaleX
val startY = startLandmark.position.y * scaleY
val endX = endLandmark.position.x * scaleX
val endY = endLandmark.position.y * scaleY
drawLine(
color = Color.Green,
strokeWidth = 4f,
start = Offset(startX, startY),
end = Offset(endX, endY)
)
}
}
}
}
Explanation
Composable Function: PoseOverlay
@Composable
fun PoseOverlay(
poseLandmarks: List<PoseLandmark>,
imageWidth: Int,
imageHeight: Int,
canvasWidth: Float,
canvasHeight: Float,
) {
- This is a @Composable function named PoseOverlay, which means it can be used within other Jetpack Compose UI components.
- It takes a list of detected pose landmarks (poseLandmarks) and the dimensions of the input image (imageWidth, imageHeight).
- canvasWidth and canvasHeight represent the size of the UI area where the landmarks will be drawn.
Scaling Factor Calculation
val scaleX = canvasWidth / imageWidth
val scaleY = canvasHeight / imageHeight
Since the input image size may differ from the displayed canvas size, we calculate scaling factors (scaleX, scaleY) to correctly map the pose landmarks onto the screen.
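Note that independent scaleX and scaleY factors will stretch the overlay when the camera image and the canvas have different aspect ratios. If your preview is displayed with a centre-crop fill (PreviewView’s default FILL_CENTER scale type), you could instead use a single uniform scale plus an offset, roughly like this (a sketch under that assumption):
// Uniform scale + offset matching a centre-crop (FILL_CENTER) preview.
val scale = maxOf(canvasWidth / imageWidth, canvasHeight / imageHeight)
val offsetX = (canvasWidth - imageWidth * scale) / 2f
val offsetY = (canvasHeight - imageHeight * scale) / 2f

val adjustedX = landmark.position.x * scale + offsetX
val adjustedY = landmark.position.y * scale + offsetY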
Drawing Landmarks
Canvas(modifier = Modifier.fillMaxSize()) {
for (landmark in poseLandmarks) {
val adjustedX = landmark.position.x * scaleX
val adjustedY = landmark.position.y * scaleY
- Canvas is a Jetpack Compose component that allows us to draw custom graphics.
- We iterate through the list of detected landmarks (poseLandmarks).
- The x and y coordinates of each landmark are adjusted according to the scaling factors.
drawCircle(
color = Color.Red,
radius = 8f,
center = Offset(adjustedX, adjustedY)
)
}
- Each landmark is represented as a red circle (Color.Red) with a radius of 8f at the corresponding position.
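ML Kit also reports an in-frame likelihood for every landmark. If the drawn points jitter for body parts that are actually outside the frame, you could skip low-confidence landmarks before drawing, for example with an arbitrary 0.5 threshold (a sketch):
// Skip landmarks that ML Kit is not confident are inside the frame.
for (landmark in poseLandmarks) {
    if (landmark.inFrameLikelihood < 0.5f) continue
    drawCircle(
        color = Color.Red,
        radius = 8f,
        center = Offset(landmark.position.x * scaleX, landmark.position.y * scaleY),
    )
}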
Defining Connections Between Landmarks
val connections = listOf(
PoseLandmark.LEFT_EYE to PoseLandmark.RIGHT_EYE,
PoseLandmark.LEFT_EYE to PoseLandmark.LEFT_EAR,
PoseLandmark.RIGHT_EYE to PoseLandmark.RIGHT_EAR,
PoseLandmark.NOSE to PoseLandmark.LEFT_EYE,
PoseLandmark.NOSE to PoseLandmark.RIGHT_EYE,
PoseLandmark.NOSE to PoseLandmark.LEFT_MOUTH,
PoseLandmark.NOSE to PoseLandmark.RIGHT_MOUTH,
// **Torso**
PoseLandmark.LEFT_SHOULDER to PoseLandmark.RIGHT_SHOULDER,
PoseLandmark.LEFT_SHOULDER to PoseLandmark.LEFT_HIP,
PoseLandmark.RIGHT_SHOULDER to PoseLandmark.RIGHT_HIP,
PoseLandmark.LEFT_HIP to PoseLandmark.RIGHT_HIP,
// **Arms**
PoseLandmark.LEFT_SHOULDER to PoseLandmark.LEFT_ELBOW,
PoseLandmark.LEFT_ELBOW to PoseLandmark.LEFT_WRIST,
PoseLandmark.RIGHT_SHOULDER to PoseLandmark.RIGHT_ELBOW,
PoseLandmark.RIGHT_ELBOW to PoseLandmark.RIGHT_WRIST,
// **Hands & Fingers (Basic)**
PoseLandmark.LEFT_WRIST to PoseLandmark.LEFT_INDEX,
PoseLandmark.LEFT_WRIST to PoseLandmark.LEFT_PINKY,
PoseLandmark.LEFT_WRIST to PoseLandmark.LEFT_THUMB,
PoseLandmark.RIGHT_WRIST to PoseLandmark.RIGHT_INDEX,
PoseLandmark.RIGHT_WRIST to PoseLandmark.RIGHT_PINKY,
PoseLandmark.RIGHT_WRIST to PoseLandmark.RIGHT_THUMB,
// **Legs**
PoseLandmark.LEFT_HIP to PoseLandmark.LEFT_KNEE,
PoseLandmark.LEFT_KNEE to PoseLandmark.LEFT_ANKLE,
PoseLandmark.RIGHT_HIP to PoseLandmark.RIGHT_KNEE,
PoseLandmark.RIGHT_KNEE to PoseLandmark.RIGHT_ANKLE,
// **Feet**
PoseLandmark.LEFT_ANKLE to PoseLandmark.LEFT_HEEL,
PoseLandmark.LEFT_ANKLE to PoseLandmark.LEFT_FOOT_INDEX,
PoseLandmark.RIGHT_ANKLE to PoseLandmark.RIGHT_HEEL,
PoseLandmark.RIGHT_ANKLE to PoseLandmark.RIGHT_FOOT_INDEX
)
- A list of pairs is defined to represent key skeletal connections.
- Each pair consists of two PoseLandmark constants, which indicate the joints to be connected.
Drawing Lines to Connect Landmarks
for ((start, end) in connections) {
val startLandmark = poseLandmarks.find { it.landmarkType == start }
val endLandmark = poseLandmarks.find { it.landmarkType == end }
- This loop iterates through the list of connections.
- We attempt to find the PoseLandmark object corresponding to each connection pair.
if (startLandmark != null && endLandmark != null) {
val startX = startLandmark.position.x * scaleX
val startY = startLandmark.position.y * scaleY
val endX = endLandmark.position.x * scaleX
val endY = endLandmark.position.y * scaleY
- If both landmarks are found, their positions are adjusted using the scaling factors.
drawLine(
color = Color.Green,
strokeWidth = 4f,
start = Offset(startX, startY),
end = Offset(endX, endY)
)
}
}
- Each connection is drawn as a green (Color.Green) line with a stroke width of 4f.
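One piece the snippets above rely on but do not show is the CameraScreen composable that setCameraPreview() calls. Its exact implementation is not part of this article’s listings; a minimal sketch that wires a PreviewView, an ImageAnalysis use case calling processImageProxy, and the PoseOverlay together could look like the following (usual CameraX and Compose imports omitted; the back camera and the keep-only-latest backpressure strategy are assumptions):
@Composable
fun CameraScreen(modifier: Modifier = Modifier) {
    val lifecycleOwner = LocalLifecycleOwner.current
    val poseLandmarks = remember { mutableStateListOf<PoseLandmark>() }
    val imageWidth = remember { mutableStateOf(1) }
    val imageHeight = remember { mutableStateOf(1) }

    BoxWithConstraints(modifier = modifier.fillMaxSize()) {
        val canvasWidth = constraints.maxWidth.toFloat()
        val canvasHeight = constraints.maxHeight.toFloat()

        AndroidView(
            modifier = Modifier.fillMaxSize(),
            factory = { ctx ->
                val previewView = PreviewView(ctx)
                val cameraProviderFuture = ProcessCameraProvider.getInstance(ctx)
                cameraProviderFuture.addListener({
                    val cameraProvider = cameraProviderFuture.get()
                    // Preview use case renders the camera feed into the PreviewView.
                    val preview = Preview.Builder().build().also {
                        it.setSurfaceProvider(previewView.surfaceProvider)
                    }
                    // Analysis use case feeds frames to ML Kit; keep only the latest
                    // frame so detection never falls behind the camera.
                    val analysis = ImageAnalysis.Builder()
                        .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
                        .build()
                        .also { analysisUseCase ->
                            analysisUseCase.setAnalyzer(ContextCompat.getMainExecutor(ctx)) { imageProxy ->
                                processImageProxy(imageProxy, poseLandmarks, imageWidth, imageHeight)
                            }
                        }
                    cameraProvider.unbindAll()
                    cameraProvider.bindToLifecycle(
                        lifecycleOwner,
                        CameraSelector.DEFAULT_BACK_CAMERA,
                        preview,
                        analysis,
                    )
                }, ContextCompat.getMainExecutor(ctx))
                previewView
            },
        )

        // Skeleton overlay drawn on top of the camera preview.
        PoseOverlay(
            poseLandmarks = poseLandmarks,
            imageWidth = imageWidth.value,
            imageHeight = imageHeight.value,
            canvasWidth = canvasWidth,
            canvasHeight = canvasHeight,
        )
    }
}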
Running the App
Once everything is set up, run the app on a physical device. You should see real-time pose detection with a skeleton overlay drawn using Jetpack Compose.
Conclusion
In this article, we implemented real-time pose detection in an Android app using ML Kit and Jetpack Compose. We covered setting up CameraX, processing frames with ML Kit, and visualizing the detected pose with Compose’s Canvas.
🚀 Try it out and customize the skeleton drawing for different use cases!
Resources
You can find the complete source code for this project in my GitHub repository: 👉 GitHub: Pose Detection
If you found this helpful, don’t forget to ⭐ the repository on GitHub.