Job listings: Information Technology & Site Reliability Engineer
-
NEW Job number: JN -092024-177099 Posted: 2026-01-28
SRE (In-house Service)
A company that values open and constructive discussion
6.8 - 15 million yen Tokyo Information Technology Site Reliability Engineer
- Company overview
- Founded and based in Tokyo, our client specializes in advanced data platform solutions, offering services that optimize data utilization for businesses worldwide. With a dedicated team of 50-100 employees, they develop and operate innovative products that enhance data discovery and management, driving business growth and efficiency across various industries.
- Responsibilities
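- Job description:
Example duties:
Develop functionality that scales Kubernetes nodes more efficiently according to the execution schedule of data transfer jobs
Identify issues in the application and infrastructure monitoring setup and improve it
Define, improve, and monitor SLIs/SLOs
Optimize infrastructure usage costs and operating costs
Investigate bugs and plan initiatives to detect bugs early
Plan and carry out product security measures toward obtaining SOC 2
Improve CI/CD pipelines
Investigate and support user inquiries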
Hideo Shirakigawa
Online / Inhouse -
Job number: JN -122025-198355 Posted: 2025-12-29
SRE Tech Lead
Lead the SRE team / Flextime / Generous benefits / Use your English skills
8 - 12 million yen Tokyo Information Technology Site Reliability Engineer
- Company overview
- Our client is a leading Japanese internet services and e-commerce company active in a wide range of business areas, including online shopping malls. It deploys advanced technology and customer-centric services used by customers in Japan and abroad. Its businesses span e-commerce, finance, and digital content, with services that include online shopping platforms, cloud services, and internet banking. It also provides point programs and media content to enhance customer engagement, and it is actively expanding both domestically and internationally to strengthen its competitiveness in the global market.
- Responsibilities
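The company is growing rapidly, which makes service reliability, scalability, and operational efficiency critically important.
This position leads the SRE team and is essential to maximizing the reliability and performance of the web front-end and back-end systems.
As SRE Team Lead, you will lead the building and improvement of infrastructure spanning public and private clouds, and you will be responsible for managing product quality, delivery, and reliability.
This is a key position that supports the company's business growth from a technical standpoint.
We look forward to applications from visionary candidates who can inspire the team and lead it to new heights of operational excellence.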
━━━━━━━━━━━━━━━
■Responsibilities
SRE Strategy & Roadmap Development: Define and drive the execution of the SRE strategy and technical roadmap to enhance service reliability, performance, and scalability.
Observability Platform Leadership: Lead the management and improvement of monitoring, alerting, logging, and tracing tools, driving the establishment of optimal observability environments for each product.
Service Quality Definition & Achievement: Define Service Level Objectives (SLOs) and Service Level Agreements (SLAs), and plan/execute improvement activities to achieve them. Drive the adoption and operation of Error Budgets.
Performance & Latency Improvement: Identify bottlenecks in service performance and latency, and direct/oversee the team in proposing and implementing solutions.
Incident Management & Troubleshooting: Act as an incident commander during production outages, leading rapid restoration efforts. Conduct Root Cause Analysis (RCA) and drive the implementation of preventative measures.
Operational Efficiency & Automation: Promote automation of operational processes to reduce toil, building an efficient and scalable operational framework.
Team Management & Development: Provide technical guidance, mentorship, and performance evaluations for SRE team members, contributing to the overall skill enhancement and performance of the team.
Cross-functional Collaboration: Strengthen collaboration with product development teams, infrastructure teams, security teams, and other relevant departments, fostering a DevOps culture and strong cooperative relationships.
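■Department overview
The Incentive Platform Department (INPD) is responsible for developing and operating the company's points and coupons.
Aiming to maximize the value of points, the department drives improvements and contributes to the ecosystem.
■Work environment
Flextime system
Generous benefits, including a stock option plan and a retirement allowance plan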
━━━━━━━━━━━━━━━
Jonathan Spence
Online / Inhouse